(57) ABSTRACT
The invention relates to the use of proteins and peptides
coded by the genome of the isolated or purified strain of
severe acute respiratory syndrome (SARS)-associated coronavirus
resulting from sample reference number 031589
and, in particular, to the use of protein S and the derivative
antibodies thereof as diagnostic reagents and as a vaccine.
FIGURE 10a
FIGURE 10b



Patent Application Publication Nov. 29, 2007 Sheet 12 of 116 US 2007/0275002 A1




FIGURE 11 










>< XhoII 

X ScrFI x sau3AI 

X Mval > < TthHBBl >< Ndell 

>< EcoRII > < TaqI >< Mfll 

x Ecll36l >< Sau3AI >< Mbol 

>< DsaV >< Ndell >< Dpnll 

x BstOI >< MboIX HnllX Dpnl 

X BstNI x Dpnll x BstYI 

X BsiLI X Dpnl X BspAI 

X BsaJI x BspAI x Bspl43I 

X Apyl x Bspl43IX Bglll 
ATATTAGGTT TTTACCTACC CAGGAAAAGC CAACCAACCT CGATCTCTTG TAGATCTGTT CTCTAAACGA 

10 20 30 40 50 60 70 

X Vnel 
X SphI 

X Snol 
X Rmal 
>< Pael x Sdul 
>< Nspl x NsplI 
X NspHI X HgiAI 
>< Nlalll X Bspl286I 
x Mae I x Brayl 
x Tru9I >< Apan 

x Msel >< Bbvl x Alw44I 

x Dral x Alul > < Fnu4HI x Alw21I 

ACTTTAAAAT CTGTGTAGCT GTCGCTCGGC TGCATGCCTA GTGCACCTAC GCAGTATAAA CAATAATAAA 
80 90 100 110 120 130 140

X Sfd 

X PstI 
X Hnll 
X Ksp632I 

X Hindi! > < MboII x Earl 

x Hindi x Maelll x Eamll04I 

TTTTACTGTC GTTGACAAGA AACGAGTAAC TCGTCCCTCT TCTGCAGACT GCTTACGGTT TCGTCCGTGT 
150 160 170 180 190 200 210 

X TthHB8I x Styl 

x TaqI x Rmal x ScrFI 

X Sau3AI x Mael X Neil 
x Ndell x ECOT14I X Mspl 

x Mbol x Ecol30I x Maelll 

x Dpnll x BssTlI X Hpall 

x Dpnl x BsaJI >< HapII 

x BspAI X Blnl X DsaV 

x Bspl43I x Avrll X Bcnl 
TGCAGTCGAT CATCAGCATA CCTAGGTTTC GTCCGGGTGT GACCGAAAGG TAAGATGGAG AGCCTTGTTC 

220 230 240 250 260 270 280 

>< Rmal 
x Esp3I x Maell 
x Hindll >< MaeII> < Eco57I x BsmAI >< Mael 

x Hindi > < AflHI > < Ddel x Alw26I x BsmBI 

TTGGTGTCAA CGAGAAAACA CACGTCCAAC TCAGTTTGCC TGTCCTTCAG GTTAGAGACG TGCTAGTGCG 
290 300 310 320 330 340 350



FIGURE 13.1 
>< Sau96I
>< PssI 
>< Pail 
>< NspIV 
>< Mnll 
>< Haelll 
X Eco0109I 

>< Dralix MboII x Prall 
>< Mnll X Cfrl3I x PmaCI 

X Ksp632I x BsuRI > < Maell 

x Hinfl X BsiZIX EcoNI X Eco72I 

. X Earl X BshI x BslI >< BsaAI 

x Plel x Earall04Ix Asul x BsiYIx BbrPI x Mnll 

TGGCTTCGGG GACTCTGTGG AAGAGGCCCT ATCGGAGGCA CGTGAACACC TCAAAAATGG CACTTGTGGT 
360 370 380 390 400 410 420 

X Tru9I 

x Rsal x SfaNI 

x Rmal x CspGI >< BspWl >< Msel 

x Mael x Alul x Afal x Alul > < Maell 

CTAGTAGAGC TGGAAAAAGG CGTACTGCCC CAGCTTGAAC AGCCCTATGT GTTCATTAAA CGTTCTGATG 
430 440 450 460 470 480 490 

X Pall 

x Haelll >< Rs aI 

X Tru9I x Gdill Mcrl x 

x Msel x Eael >< Csp6I 

X Esp4I X BsuRI >< Bsml BsiEI X 

X Aflll X BshI X Alul X BscCI X Afal 



>< Nspl 
x Seal x NspHI 

x Rsal x Main 

> < Csp6I x BslI x MboII 

>< BsrI >< BsiYI >< MboII 

X Acll — X ATaT >rHm'r X Muni X Acil 

TAGCGGTATA ACACTGGGAG TACTCGTGCC ACATGTGGGC GAAACCCCAA TTGCATACCG CAATGTTCTT 
570 580 590 600 610 620 630 

X TthHB8I 
X TaqI 
X Sau3AI 
X Ndell 
>< Mbol 
X DpnII 

> < Dpnl 
X Clal 
X Bsul5I 
X BspDI 

X NlalV >< BspAI 

x Mspl > < Bspl43I 

x Hpall x Bspl06I 

X HapII >< BsiXI Maelll > 

x CfrlOI x BscIX SfaNI Ddel X 

x BscBI X Alul X Banlll Bfrl X 

CTTCGTAAGA ACGGTAATAA GGGAGCCGGT GGTCATAGCT ATGGCATCGA TCTAAAGTCT TATGACTTAG 
640 650 660 670 680 690 700



FIGURE 13.2 
>< Sau3AI
X Ndell 
>< Mbol 

>K K P hI Vnel >< 

°P nIr Snol X 

, >J BSPAI > < Nlalll 

X Alwlx Dpnl >< Ddel . >K 

X Alul X Bspl43I >< MboXI X Bsri Alw44I X 

GTGACGAGCT TGGCACTGAT CCCATTGAAG ATTATGAACA AAACTGGAAC ACTAAGCATG GCAGTGGTGC 

710 720 730 740 750 760 770

X SstI 
X Sdul 
x Sad 
X NspII 
x Mnll 
X HgiAI 
x Eco24I 
X Ecll36II 
X Bspl286I 
X BmyrI 
. x Banll 

X Alw21I 
X Alul 



x Sdul 

X NspII 
X HgiAI 

x Drain 
X Bspl286I 
X Bmyl 
X Alw211 



X TthHB8I 
>< Taql 

> < Salt 

> < Rtrl 
X Hindll 
x Hindi 

x Bsgl 
x Acd 



Sau96I x 
Pall x 
NspIV >< 
Haelll X 
Cfrl3I x 
BsuRI x 
Bsizi x 

BshI x 
Asul : 



„„ pweni »< ACCI Asul X 

ACTCCGTGAA CTCACTCGTG AGCTCAATGG AGGTGCAGTC ACTCGCTATG TCGACAACAA TTTCTGTGGC 
780 790 800 810 820 830 840

X Thai 
x Thai 

X Mvnl 
x Mvnl 

> < Rsal >< HinPlI 

> < NlalV >< Hin6I 

><: Kpnl >< Hhal 

X Eco64I >< cfoI 

X Csp6I >< Bst0I 

> < BscBI >< BstUI 
>< Banl >< BspSOI 
X Asp718 >< Bsp50I 

> < Afal >< AciI 
X ACCB1I >< AccII 
X Acc65I x Mnll x SfaNI x AccII 



> < Vnel 

> < Snol 

>< Sdul 

NspII X 
HgiAI X 
Bspl286I X 

>< Bmyl 

> < ApaLI 

> < Alw44I 
Alw21I X 



CCAGATGGGT ACCCTCTTGA TTGCATCAAA GATTTTCTCG CACGCGCGGG CAAGTCAATG TGCACTCTTT 



850 860 870 880 890 900 910



X TthHB8I 
X TthH88r 

X Taql 
x Taql 

x Mnll 
X Ksp632I 
x Hinflx Plel 
x EaralI04I x MboII 

c Earl > < Bbvix AccI 



Nlalll x 
x Nlalll 
x Maelll EcoRII x 

Fnu4HI DsaV x 



' ~ - - duu'\ ncci -'v cnuini DsaV x 

CCGAACAACT TGATTACATC GAGTCGAAGA GAGGTGTCTA CTGCTGCCGT GACCATGAGC ATGAAATTGC 



920 930 940 950 960 970 980



x TthHB8I 
X Taql 
X Sful 

X NspVX Tru9I 
X LspIX Msel 
X Mval >< Hin6I >< Sdul >< Csp45I

>< Ecll36I >< Hhal >< NspII X BstBI 

>< BstOI >< Haell X HgiAI >< Bspll9I 

>< BstNI >< Eco47III >< Bspl286I >< BsiCI 

>< BsiLI >< Cfol X Bmyl >< Bpul4I 

>< Apyl X Ddel X Bspl43II >< Alul >< Alw21I >< AsuII 

CTGGTTCACT GAGCGCTCTG ATAAGAGCTA CGAGCACCAG ACACCCTTCG AAATTAAGAG TGCCAAGAAA

990 1000 1010 1020 1030 1040 1050 

X Tru9I 
>< BsmI X Hsel 

>< BscCI > < Mnll 

TTTGACACTT TCAAAGGGGA ATGCCCAAAG TTTGTGTTTC CTCTTAACTC AAAAGTCAAA GTCATTCAAC 
1060 1070 1080 1090 1100 1110 1120 

X Pull 
x PmaCI 
X Maell 
x Eco72I 

X BsaAI >< Nlalll >< Rsal 

X BbrPI X Bstll07I X Csp6I 

X Afllll >< Mnllx Ddel >< AccI >< Afal 

CACGTGTTGA AAAGAAAAAG ACTGAGGGTT TCATGGGGCG TATACGCTCT GTGTACCCTG TTGCATCTCC 
1130 1140 1150 1160 1170 1180 1190

>< SfaNI 

x Maelll x AccI Nlalll x 

ACAGGAGTGT AACAATATGC ACTTGTCTAC CTTGATGAAA TGTAATCATT GCGATGAAGT TTCATGGCAG 
1200 1210 1220 1230 1240 1250 1260 

>< Sim 

x Sau96I 
Pssl x 

>< Psp5II 
x PpuMI 
>< NspIV 

x NspHII 
x Eco47I 
x Drail 
x Cfrl3I 
x BsiZI 
X Bmel8I 
X Avail 
x Asul 

x Maell EcoO109l XAfllll > 

ACGTGCGACT TTCTGAAAGC CACTTGTGAA CATTGTGGCA CTGAAAATTT AGTTATTGAA GGACCTACTA 
1270 1280 1290 1300 1310 1320 1330 

Van91I X 

Sin! X 

x Rsal Sau96I x 

x Nspl PflMI >< 

X NlalV NspIV >< 

x Nlalll NspHII > 

X NspHIX Kpnl Eco47I X 

x Eco64I Cfrl3I x 

x Csp6I BslI x 

>< BscBI BsiZI x 

x BanI BsiYI x 

x Asp718 Bmel8l x 

x Afal Avail >< 

x AccBlI Asul X 



FIGURE 13.4
>< Acc65I >< Sfcl >< Nlalll AccB7I ><

CATGTGGGTA CCTACCTACT AATGCTGTAG TGAAAATGCC ATGTCCTGCC TGTCAAGACC CAGAGATTGG 
1340 1350 1360 1370 1380 1390 1400

>< TthHBBI 
>< TaqlX Mnll 
>< Hinfl 

>< Ddel >< piei x Acil 

ACCTGAGCAT AGTGTTGCAG ATTATCACAA CCACTCAAAC ATTGAAACTC GACTCCGCAA GGGAGGTAGG 
1410 1420 1430 1440 1450 1460 1470 

X Rmal NlalV X 

>< Mnll >< BsrI 

>< Mael x Bbvl x Fnu4HI BscBI X 

ACTAGATGTT TTGGAGGCTG TGTGTTTGCC TATGTTGGCT GCTATAATAA GCGTGCCTAC TGGGTTCCTC 
1480 1490 1500 1510 1520 1530 1540 

XhoII X 
Sau3AI X 
Ndell X 
Mfll X 

x Maelll Mbol x 

>< Pall x Eco31I Dpnll X 

x Haelll X BsrI X Mnll Dpnl > 

>< Rmal x BsuRI X BsrI . X BsmAI Bstlfl X 

x Mnll > < Ddel x BspWl x Bsalx HphI BspAI x 

>< Mael x BshlX Bgll x Alw26I Bspl43I > 

GTGCTAGTGC TGATATTGGC TCAGGCCATA CTGGCATTAC TGGTGACAAT GTGGAGACCT TGAATGAGGA 
1550 1560 1570 1580 1590 1600 1610 

> < Tru9I 

> < Msel 

X Maell X Tru9I 

>< Hpal > < Mnll 

x Hindll > < Ksp632I 

x Hinfl X'Plel x Hindi > < Earl 

x Alwl >< Ddel x Afllll x Msel > < Eamll04I 

TCTCCTTGAG ATACTGAGTC GTGAACGTGT TAACATTAAC ATTGTTGGCG ATTTTCATTT GAATGAAGAG 

1620 1630 1640 1650 1660 1670 1680

X MboII PieI >< 

X BstXI x SfaNI > < Hinfl 

GTTGCCATCA TTTTGGCATC TTTCTCTGCT TCTACAAGTG CCTTTATTGA CACTATAAAG AGTCTTGATT 
1690 1700 1710 1720 1730 1740 1750

X Styl 
X Maelll 

x EcoT14I 
>< Plel x Ecol30I 

X Maelll >< BssTlI BslI X 

x Hinfix Acil x BsaJI BsiYI x 

ACAAGTCTTT CAAAACCATT GTTGAGTCCT GCGGTAACTA TAAAGTTACC AAGGGAAAGC CCGTAAAAGG 
1760 1770 1780 1790 1800 1810 1820 

X Sau3AI x Van91I 

x Ndell x PflMI 

>< Mbol x Drain 

x Dpnll x BslI 

>< Dpnl x Tru9I X BsiYI 
x BspAI >< Msel >< Bbvl 

x Bspl43I X AccB7I 



FIGURE 13.5
x Thai
>< SfaNI 
X Mvnl 
X HinPlI 
x HinPlI 

x Hin6I. 
X Hin6I 

x Hhal 
>< Sau3AI X Hhal 

X Ndell >< Cfol PvuII > 

X Mbol X Cfol Psp5I > 

>< Dpn.II X BstOI NspBII > 

x Dpnl X BssHII HphI X 

X BspAI X Bsp50I Fnu4HI >< 

X Bspl4 3I X AccII X Fhu4HI >< Bbvl Alul : 



X TthHB8I . 

X Styl 
>< Ncol 
x Hindll 
>< Hindi 
x Hinll 

X ECOT14I 
>< Eco57I 
X TaqlX ECO130I 
X Sail >< Dsal 
x Rtrl X BssTH 
X BsaHI 

x Bbillx Hlalll 

X Maelll X Acyl X Hgal 

>< Bbvl X Maell x Acclx BsaJI HphI x 

CTGTCACCAT ACTTGATGGT ATTTCTGAAC AGTCATTACG TCTTGTCGAC GCCATGGTTT ATACTTCAGA 
1970 1980 1990 2000 2010 2020 2030 

x Rsal 

X Ndel > < Csp6I 

X BspMI x Maelll x BsrI x Afal x Ddel 

CCTGCTCACC AACAGTGTCA TTATTATGGC ATATGTAACT GGTGGTCTTG TACAACAGAC TTCTCAGTGG
2040 2050 2060 2070 2080 2090 2100 

X Stul 

X Pall 

x Haelll 

X Ecol47I 
>< Sdul x Ddel 

x NspII >< BsuRI 

X Bspl286I x BshI Ddel x 

x Bmyl x AatI > < Mnll Bfrl >< 

TTGTCTAATC TTTTGGGCAC TACTGTTGAA AAACTCAGGC CTATCTTTGA ATGGATTGAG GCGAAACTTA 
2110 2120 2130 2140 2150 2160 2170 

X Tfil 

x Hinfl Tthllll x 

x SfaNI x Bsgl >< Fokl Aspl >< 

GTGCAGGAGT TGAATTTCTC AAGGATGCTT GGGAGATTCT CAAATTTCTC ATTACAGGTG TTTTTGACAT 
2180 2190 2200 2210 2220 2230 2240



FIGURE 13.6 
Tru9I ><
Hsel x 
Hpal > 

^ r. c Hindll > 

>< Eco571 Hindi > 

CGTCAAGGGT CAAATACAGG TTGCTTCAGA TAACATCAAG GATTGTGTAA AATGCTTCAT TGATGTTGTT 
2250 2260 2270 2280 2290 2300 2310

X Sau3AI 
>< Ndell 
>< Mbol 

> < Maelll >< Sau3AI 

>< Fbal >< Ndell 

>< Dpnri >< opnll 

>K D P nI >< DpnIMboII X 

>< BspAI >< HinPlI Ddel X 

X Bspl43I >< Hin6I X Bspl43I 

x TthHBBI X BsiQI >< Hhal X MboIBfrl x 

X TaqI >< Bell >< Cfol >< BspAI Bbsl x 
AACAAGGCAC TCGAAATGTG CATTGATCAA GTCACTATCG CTGGCGCAAA GTTGCGATCA CTCAACTTAG 

2320 2330 2340 2350 2360 2370 2380 

>< PvuII 

x Maell >< PspSI 

X Bstll07I x NspBII 

X BsaAI Fnu4HI X 
X Bbvl > < Fnu4HI 

"P h I >< DrdI x AccI >< ALuI 

GTGAAGTCTT CATCGCTCAA AGCAAGGGAC TTTACCGTCA GTGTATACGT GGCAAGGAGC AGCTGCAACT 
2390 2400 2410 2420 2430 2440 2450 

X Tru9I 

x KlalV 
x Msel 

X Mnll 

>< Es P« X seal 

X Eco64I >^ RsaJ 

X BscBI X NlalUHnll X 

X Nlalll X BanI MnlI >K 

AflIr >< Tfil x Cs p6I 

X Bbvl x AccBlI x Maelll x Hinfl X HphI X Afal 
ACTCATGCCT CTTAAGGCAC CAAAAGAAGT AACCTTTCTT GAAGGTGATT CACATGACAC AGTACTTACC 

2460 2470 2480 2490 2500 2510 2520 

> < Xhol 

X TthHBBI 
X TthHB8Ix TaqI 

> < Slal 

> < PaeR7I 

> < NspIII 

x HphI X Hiflll 

> < Eco88I 

> < Ccrl 

X Esp3I X BsaHI. 

> < Bcol 

x BsmAI X Bbill 

> < Aval x Hgal 
X TaqI > < Ama87IX BsraBt 

X DdelX Mnll >< Alw26I x Acyl x Alul 

TCTGAGGAGG TTGTTCTCAA GAACGGTGAA CTCGAAGCAC TCGAGACGCC CGTTGATAGC TTCACAAATG 
2530 2540 2550 2560 2570 2580 2590 



FIGURE 13.7 
>< Pall x Nlalll
>< Haelll >< Mnll 
>< BsuRI >< Ddel X Tru9I 
>< Alul x BsrI >< BshI >< Bfrl >< Msel 

GAGCTATCGT TGGCACACCA GTCTGTGTAA ATGGCCTCAT GCTCTTAGAG ATTAAGGACA AAGAACAATA 



2600 2610 2620 2630 2640 2650 2660



X MstI 
x HinPlI 
X Hin6I 
>< Hhal 
X Fspl 
X Fdill 
x Cfol 
X Avill 



>< ScrFI 
X Mval 
x EcoRII 

X Eel 13 61 
>< DsaV 
X BstOI 
X BstNI 
X BsmAI 
X BsiLI 
>< Apyl . 
Alw26I 



>< Vnel 

Tru9I X 
X Snol 

>< Sdul 
x Nspir 
Msel x 
>< HgiAI 
Bspl286I XBslI X 
BsiYI X 
X Bmyl 
X ApaLI 
X Tru9I X Alw44I 

x BsrI >< Msel >< Alw21I 

CTGCGCATTG TCTCCTGGTT TACTGGCTAC AAACAATGTC TTTCGCTTAA AAGGGGGTGC ACCAATTAAA 
2670 2680 2690 2700 2710 2720 2730 

X Maelll >< Mborl > < Maelll >< Hinfl Alul X 

GGTGTAACCT TTGGAGAAGA TACTGTTTGG GAAGTTCAAG GTTACAAGAA TGTGAGAATC ACATTTGAGC 
2740 2750 2760 2770 2780 2790 2800 

x Rsal 
x Nlaiv 
Maelll x 
x Msplx Kpnl 
>< Hpall 
x HapII 

> < Eco64I 
x Csp6I 

>< Tfil >< BscBI 

> < BanI 

> < Asp7l8 
X Hinfl x Afal 

> < AccBlI 
< Acc65I 



x Mae I I 

x Hindu 
>< Hindi 

X AflHI 



X Sdul 
>< NspII 
>< HgiAI 
>< Bspl286I 
X Bmyl 
>< Alw21I 

X AccI 



«gci p < ACC65I 

TTGATGAACG TGTTGACAAA GTGCTTAATG AAAAGTGCTC TGTCTACACT GTTGAATCCG GTACCGAAGT 
2810 2820 2830 2840 2850 2860 2870 

x Sau3AI 
X Ndell 
X Mbol 
X DpnII 

>< > < Dpnl 

X NspHI >< Mbol I >< BspAI 

^ ~, r >K NlaIn > < BsrI > < Bspl43I 

X Ddel >< Mnll x AlwNI >< Bbsl >< AlwNI 

TACTGAGTTT GCATGTGTTG TAGCAGAGGC TGTTGTGAAG ACTTTACAAC CAGTTTCTGA TCTCCTTACC 
2880 2890 2900 2910 2920 2930 2940 

X Sau3AI 
x Ndell 
x Mbol 
x DpnII 

>< Dpnl 
x BspAI 



FIGURE 13.8 
X Nlalllx Bspl43l X Alul >< SfaNI

AACATGGGTA TTGATCTTGA TGAGTGGAGT GTAGCTACAT TCTACTTATT TGATGATGCT GGTGAAGAAA 
2950 2960 2970 2980 2990 3000 3010 

X SfaNI 
X Mnll 

X MboII x Gsul X Ksp632I >< Mnll 

x BsaAI x Earl > < MboII 

>< HphI x Maellx Bpml X Mnll x Eamll04I x MboII 

ACTTTTCATC ACGTATGTAT TGTTCCTTTT ACCCTCCAGA TGAGGAAGAA GAGGACGATG CAGAGTGTGA 
3020 3030 3040 3050 3060 3070 3080 

> < Rsal 
x Rsal 
x Nlalll 

x Mnll x Fokl 

x Csp6I Eco31I x 

x CspSI x MamI BsmAI >< 

x MboII > < Afal x BsiBI Bsal x. 

X MboII x Afal X BsaBIAlw26I X 

GGAAGAAGAA ATTGATGAAA CCTGTGAACA TGAGTACGGT ACAGAGGATG ATTATCAAGG TCTCCCTCTG 
3090 3100 3110 3120 3130 3140 3150 

X NlalVX Pvullx XmnI 
X Eco64I x Psp5I >< TthHB8I 
x Mnll x Ddel x TaqI >< Mnll x MboII 

X BscBIX NspBII >< Mnll x Ksp632I x MboII X MboII 

x BanI x Mnll x Earl >< BsrI 

X AccBlI X Alul X Asp700I X Eamll04I X MboIIX Bbsl 

GAATTTGGTG CCTCAGCTGA AACAGTTCGA GTTGAGGAAG AAGAAGAGGA AGACTGGCTG GATGATACTA 
3160 3170 3180 3190 3200 3210 3220 

X Tru9I 

>< fokl X Msel X Eco57I 

x Ddel >< Bsrix MboII BsrI x 

CTGAGCAATC AGAGATTGAG CCAGAACCAG AACCXACACC TGAAGAACCA GTTAATCAGT TTACTGGTTA 
3230 3240 3250 3260 3270 3280 3290 

X Tru9I >< Mnll 

X Msel x Tru9I x Rindllx Tru9I x Dralll 

•X Oral >< Msel X Hincllx Msel X BspWI 

TTTAAAACTT ACTGACAATG TTGCCATTAA ATGTGTTGAC ATCGTTAAGG AGGCACAAAG TGCTAATCCT 

3300 3310 3320 3330 3340 3350 3360 

x Vnel 
X Snol 

> < Sdul 

> < NspII 

> < HglAI 

> < Bspl286I 

x ApaLI 

x HphI > < Nlalll X Alw44I 

x Bbvl X Fnu4HI X BspMI > < Alw21I 

ATGGTGATTG TAAATGCTGC TAACATACAC CTGAAACATG GTGGTGGTGT AGCAGGTGCA CTCAACAAGG 
3370 3380 3390 3400 3410 3420 3430

x Sau96I 
x Pali 

X NspIV 
X Haelll 

X NlalV >< C frl3I 



FIGURE 13.9 
>< Eco64I >< BsuRI

>< BscBI > < Tru9I X BsiZI 

>K BanI > < Msel >< BshI >< Mnll 

>< AccBlIx Nlalll x Alul x Asul >< Mnll 

CAACCAATGG TGCCATGCAA AAGGAGAGTG ATGATTACAT TAAGCTAAAT GGCCCTCTTA CAGTAGGAGG 
3440 3450 3460 3470 3480 3490 3500

x sim 

x Sau96I 
x tJspIV 
>< NspHIX NspHII 
X Eco47I 
x Cfrl3I 
x Nlalll >< BspMI 
X BsiZI 
x BmelSI 
x Avail Mnll x 
> < Ddel X Nsplx Asul Fokl >< 

GTCTTGTTTG CTTTCTGGAC ATAATCTTGC TAAGAAGTGT CTGCATGTTG TTGGACCTAA CCTAAATGCA
3510 3520 3530 3540 3550 3560 3570 

> < Tru9I 
>< Hphl> < Msel 
x Esp4I 
x Alul > < Ndel 

X AflUX Fnu4HI X Bbvl 
GGTGAGGACA TCCAGCTTCT TAAGGCAGCA TATGAAAATT TCAATTCACA GGACATCTTA CTTGCACCAT 
3580 3590 3600 3610 3620 3630 3640 

Rsal X 
CspSI X 

x Eco57I >< Bcgl Afal X 

TGTTGTCAGC AGGCATATTT GGTGCTAAAC CACTTCAGTC TTTACAAGTG TGCGTGCAGA CGGTTCGTAC 
3650 3660 3670 3680 3690 3700 3710 

>< Bsgl >< BspMI 

x Bcgl/a x Alul x Nlalll 

ACAGGTTTAT ATTGCAGTCA ATGACAAAGC TCTTTATGAG CAGGTTGTCA TGGATTATCT TGATAACCTG
3720 3730 3740 3750 3760 3770 3780 

X Mnll 

X Rroal > < Mnll x NlalV x Tfil X MboII 

>< Mael x Eco57I x BscBI x Hinfl >< Ddel 

AAGCCTAGAG TGGAAGCACC TAAACAAGAG GAGCCACCAA ACACAGAAGA TTCCAAAACT GAGGAGAAAT 
3790 3800 3810 3820 3830 3840 3850 

x Tru9I 

x StuI 
x Pall 

x Msel x Mnll x Maelll 
x Haelll x Eco0651 

X Ecol47I x Eco91I 

X Rsal >< BsuRI BstXI X 

X Csp6I X TthHB8I X BshI X BstPI 

X Afal X TaqI x AatI X BstEII 

CTGTCGTACA GAAGCCTGTC GATGTGAAGC CAAAAATTAA GGCCTGCATT GATGAGGTTA CCACAACACT 
3860 3870 3880 3890 3900 3910 3920 

Tfil X 
Nlalll x 
Hinfl X 

>< Wei x EcoRV x Hindlll 

FIGURE 13.10 
X BsrI >< MboII >< Maelll x Eco32I >< Alul

GGAAGAAACT AAGTTTCTTA CCAATAAGTT ACTCTTGTTT GCTGATATCA ATGGTAAGCT TTACCATGAT 
3930 3940 3950 3960 3970 3980 3990 

>< Nspl 
>< NspHI 

>< Main x SfaNI 

>< Mnll > < EcoMI 

>< Ddel x Mbon X BslI > < fJlalll 

x Ddel x Bfrl X HphI X BsiYI >< Fokl 

TCTCAGAACA TGCTTAGAGG TGAAGATATG TCTTTCCTTG AGAAGGATGC ACCTTACATG GTAGGTGATG 
4000 4010 4020 4030 4040 4050 4060 

X Spel 
X Rmat 

X Mael x EcoRVX HphI x SfaNI 

x HphI >< Eco32I >< Mnll X Ddel 

TTATCACTAG TGGTGATATC ACTTGTGTTG TAATACCCTC CAAAAAGGCT GGTGGCACXA CTGAGATGCT 
4070 4080 4090 4100 4110 4120 4130 

x ScrFI 
X Rsal 

X Mval 
X ECOR1I 

X EC1136I 
X DsaV 
x Csp6I >< EcoNI 
X BstOI 
X BstNI 
X BsiLI 
x BsaJI 
x BsaAI x BslI 
X MboII x Maellx Apyl 

>< Alul >< BsrI >< Afal X BsiYI 

CTCAAGAGCT TTGAAGAAAG TGCCAGTTGA TGAGTATATA ACCACGTACC CTGGACAAGG ATGTGCTGGT 
4140 4150 4160 4170 4180 4190 4200 

x Tru9I 
x Msel 

X Odel >< Esp4I >< Rsal 

>< Mnll >< BspWI >< Csp6I 

x Fokl X Alul x Aflll >< Eco57I >< Afal 

TATACACTTG AGGAAGCTAA GACTGCTCTT AAGAAATGCA AATCTGCATT TTATGTACTA CCTTCAGAAG 

4210 4220 4230 4240 4250 4260 4270 

>< ScrFI 
x Mval 
x EcoRII 

x XmnI >< EC1136I Hlalll x 

> < Ksp632I x Rraal >< DsaV Ksp632I x 

> < Earl > < Tfilx MboII x BstOI x Earl 

> < Eamll04I x Mael x BstNI Earall04I X 

> < Ddel > < Hinfl x BsiLI BsmAI x 
X BspWI X Asp700I X Apyl Aiw26I X 

CACCTAATGC TAAGGAAGAG ATTCTAGGAA CTGTATCCTG GAATTTGAGA GAAATGCTTG CTCATGCTGA 
4280 4290 4300 4310 4320 4330 4340 

X Vspl x Zsp2I 

X Tru9I x PpulOI 
x Msel X Nsil 

x MboII x main x Fokl • 

x Eco57I x Mphll03I x Fokl 



FIGURE 13.11
>< Asnl >< EcoT22I >< BspWI

>< Asel >< Avalll >< Bgll >< Maell 

AGAGACAAGA AAATTAATGC CTATATGCAT GGATGTTAGA GCCATAATGG CAACCATCCA ACGTAAGTAT 
4350 4360 4370 4380 4390 4400 4410 

x SfaHI 

>< Tru9I > < Hindll >< Tfil >< S pel 

>< Msel > < HincIIX MboII >< Rmal 

>< "nil x DrdI >< Hinfl >< Mael 

AAAGGAATTA AAATTCAAGA GGGCATCGTT GACTATGGTG TCCGATTCTT CTTTTATACT AGTAAAGAGC 
4420 4430 4440 4450 4460 4470 4480 

x Maelll 

>< sfcI X Fnu4HI x Muni 

>< A lul >< Alul x Acil Maelll x 

CTGTAGCTTC TATTATTACG AAGCTGAACT CTCTAAATGA GCCGCTTGTC ACAATGCCAA TTGGTTATGT 
4490 4500 4510 4520 4530 4540 4550 

>< Thai 
x Hvnl 

x MboII 
x HlnPlI 
>< HinPlI 

x Hin6I 
x Hin6I 

x Hhal 
>< Hhal 
X Fnu4HI 

x Cfol 
x Cfol 
X BstOI 
x BssHIIX BspWI x Tru9I 
X Bsp50I x Msel 

>< AccII x Alul HphI X 

GACACATGGT TTTAATCTTG AAGAGGCTGC GCGCTGTATG CGTTCTCTTA AAGCTCCTGC CGTAGTGTCA 
4560 4570 4580 4590 4600 4610 4620 

x Maelll 

X SfaNI >< AlwNI >< Mnl i >< M nllx Ddel 

GTATCATCAC CAGATGCTGT TACTACATAT AATGGATACC TCACTTCGTC ATCAAAGACA TCTGAGGAGC 
4630 4640 4650 4660 4670 4680 4690

X SinI 
x Sau96I 
x NspIV 

X NspHII 
X EC047I 
x Cfrl3I 
X BsiZI 
X BmelBI 
>< Avail 
x Asul 



x Tru9I 
x Nlalll 

X Msel 
X Moll 
>< Ksp632I 
x Earl 
X Eamll04I 
x Bbvl 



X Sdul 
X NspII 
X HgiAI 
X Bspl286I 
X Bmyl 
X Alw21I 



>< Rsal 
x Csp6I 
X Afal 



" Asui >< Afal 

ACTTTGTAGA AACAGTTTCT TTGGCTGGCT CTTACAGAGA TTGGTCCTAT TCAGGACAGC GTACAGAGTT 
4700 4710 4720 4730 4740 4750 4760 

> < TthHB8I 

> < TaqI 
>< Sdul 

>< Van91I x NspII 

x Tru9I x Rsal >< PflMI x Eco24I 

>< Msel x HphI X BslI X Bspl286I 

>< Esp4I x Csp6I X BsiVI x Bmyl Gsul X 
x Aflll >< Maelll >< Afal >< AccB7I >< BanllBpml ><

AGGTGTTGAA TTTCTTAAGC GTGGTGACAA AATTGTGTAC CACACTCTGG AGAGCCCCGT CGAGTTTCAT 
4770 4780 4790 4800 4810 4820 4830

>< Tru9I 
>< Plel >< EcoNI 
>< Mnll >< BslI 
X BsmAI >< BsiYI 
>< Mnll x HphI X HinfIX Alw26IX Acil x Msel 

CTTGACGGTG AGGTTCTTTC ACTTGACAAA CTAAAGAGTC TCTTATCCCT GCGGGAGGTT AAGACTATAA 
4840 4850 4860 4870 4880 4890 4900 

X Alul x Ndel 

AAGTGTTCAC AACTGTGGAC AACACTAATC TCCACACACA GCTTGTGGAT ATGTCTATGA CATATGGACA 
4910 4920 4930 4940 4950 4960 4970

x SinI 
X Sau961 
X NspIV 

X NspHII 
X Eco47I 

X Cfrl3I Nlalll X 

x BsiZI x Nlalll 

x Bmel8I > < Mnll 

X Avail X Maelll X Tru9I X Mnll 

>< Asul x Fokl x Msel x BspHI 

GCAGTTTGGT CCAACATACT TGGATGGTGC TGATGTTACA AAAATTAAAC CTCATGTAAA TCATGAGGGT 
4980 4990 5000 5010 5020 5030 5040 



X Rsal > < TaqI 

> < Rroal x SnaBI x Seal 

> < Mael x Maell x Hindlll x Rsal 
X Csp6I X EcolOSI X Csp6I 

>< Afal x BsaAI x Alul x Afal 



> < Csp6I x Tru9I Mnll > 

X Afllll X Msel BslI X 

X Afal X Dral BsiYI X 

ATGAGAGTTT TCTTGGTAGG TACATGTCTG CTTTAAACCA CACAAAGAAA TGGAAATTTC CTCAAGTTGG 
5120 5130 5140 5150 5160 5170 5180 

x Tru9I x Tru9I x Rroal 

X Msel X Msel x Muni x Mael Alul > 

TGGTTTAACT TCAATTAAAT GGGCTGATAA CAATTGTTAT TTGTCTAGTG TTTTATTAGC ACTTCAACAG 
5190 5200 5210 5220 5230 5240 5250

X SfaNI 

x Sdul 

x Nspll 

X Eco24I 

X B3pl286I 

X Bmyl HphI > 

X Bbvl Fnu4HI X 

x Mnll X Banll X BspWI 



FIGURE 13.13 
CTTGAAGTCA AATTCAATGC ACCAGCACTT CAAGAGGCTT ATTATAGAGC CCGTGCTGGT GATGCTGCTA
5260 5270 5280 5290 5300 5310 5320 

X Vnel 
>< Snol 

>< Sdul 

>< Nspl I 

X HgiAI 

>< Bspl286I 

>< Bmyl 
>< ApaLI 

>< Alw44I MboII X 

>< Alw21I >< Alul >< HphI 

ACTTTTGTGC ACTCATACTC GCTTACAGTA ATAAAACTGT TGGCGAGCTT GGTGATGTCA GAGAAACTAT 
5330 5340 5350 5360 5370 5380 5390

> < SphI 

> < Pael 

> < Nspl 

> < NspHI X Tfil x Tru9I 
>< Sfcl > < Nlalllx Hinfl x Msel 

GACCCATCTT CTACAGCATG CTAATTTGGA ATCTGCAAAG CGAGTTCTTA ATGTGGTGTG TAAACATTGT 

5400 5410 5420 5430 5440 5450 5460 

x Rsal 

X Tru9I > < Csp6I Esp4I > 

x Msel >< Alul x Afal Aflll > 

GGTCAGAAAA CTACTACCTT AACGGGTGTA GAAGCTGTGA TGTATATGGG TACTCTATCT TATGATAATC
5470 5480 5490 5500 5510 5520 5530 

x Rsal 

X MboII 
x RmalHinCl x 
X Csp6I 

X Tru9I >< SfaNI >< Mae I x Bbsl 

>< Msel x Klalll x Afal 

TTAAGACAGG TGTTTCCATT CCATGTGTGT GTGGTCGTGA TGCTACACAA TATCTAGTAC AACAAGAGTC 
5540 5550 5560 5570 5580 5590 5600 

x Rsal 

x Plel > < Odel X Csp6I 

>< Bsgl x BspWI x BspMI x Afal 

TTCTTTTGTT ATGATGTCTG CACCACCTGC TGAGTATAAA TTACAGCAAG GTACATTCTT ATGTGCGAAT 
5610 5620 5630 5640 5650 5660 5670 

>< Eco31I 

X Rsal >< Ddel 

> < Maelll X BsmAI 

X Csp6I >< Bsal Mnll x 

X Afal X BsrI X Alw26I HphI > 



x SstI x SinI 

X Sdul x Sau96I 



: Sad >< NspIV 

x Nspl I x NspHI I 

X HgiAI ' > < Rsal X Maelll 

x Eco24I x Eco47I 

x Ecll36II >< CfrUI 

X Bspl286I >< BsiZI 

>< Bmyl x BmelBI 

FIGURE 13.14
X Banll >< Avail

>< Alw2ll >< Csp6I>< Asul 

>< Alul > < Afal >< Bsrix AlwNI 

ACGGAGCTCA CCTTACAAAG ATGTCAGAGT ACAAAGGACC AGTGACTGAT GTTTTCTACA AGGAAACATC 
5750 5760 5770 5780 5790 5800 5810 

>< TthHB8I 

x TaqI >< Maelll 

TTACACTACA ACCATCAAGC CTGTGTCGTA TAAACTCGAT GGAGTTACTT ACACAGAGAT TGAACCAAAA 
5820 5830 5840 5850 5860 5870 5880 

>< Rsal 
>< Csp6I 
>< Sfcl >< Bbvl 
>< Fokl >< Fnu4HI >< Afal 

TTGGATGGGT ATTATAAAAA GGATAATGCT TACTATACAG AGCAGCCTAT AGACCTTGTA CCAACTCAAC 
5890 5900 5910 5920 5930 5940 5950 

Tru9I >< 
Swal >< 
Msel >< 

> < Nspl MamI >< 

> < NspHI Dral >< 

> < Nlalll BsiBI >< 
>< Afllll BsaBI X 

CATTACCAAA TGCGAGTTTT GATAATTTCA AACTCACATG TTCTAACACA AAATTTGCTG ATGATTTAAA 
5960 5970 5980 5990 6000 6010 6020 

>< MboII 
X Alul X AluIX MacIII 

TCAAATGACA GGCTTCACAA AGCCAGCTTC ACGAGAGCTA TCTGTCACAT TCTTCCCAGA CTTGAATGGC 
6030 6040 6050 6060 6070 6080 6090 

X Sfcl 

GATGTAGTGG CTATTGACTA TAGACACTAT TCAGCGAGTT TCAAGAAAGG TGCTAAATTA CTGCATAAGC 
6100 6110 6120 6130 6140 6150 6160 

X Tru9I 

x ScrFl 
X Mval 
x Msel 

x EcoRII 

X Ecll36I 
X DsaV 

x BstOI 

>< BstNI Maell x 

>< Muni x BsiLI X Drain 

>< BstXI X Apyl x Maell x BstXI 

CAATTGTTTG GCACATTAAC CAGGCTACAA CCAAGACAAC GTTCAAACCA AACACTTGGT GTTTACGTTG 

6170 6180 6190 6200 6210 6220 6230

> < Rsal 

>< Cs P 61 MboII X 

> < Afalx BsrI >< Bbsl 
TCTTTGGAGT ACAAAGCCAG TAGATACTTC AAATTCATTT GAAGTTCTGG CAGTAGAAGA CACACAAGGA 

6240 6250 6260 6270 6280 6290 6300 

>< Hindu x MboII 

X Hindi x Mnll x Eco57I 

ATGGACAATC TTGCTTGTGA AAGTCAACAA CCCACCTCTG AAGAAGTAGT GGAAAATCCT ACCATACAGA 
6310 6320 6330 6340 6350 6360 6370 



FIGURE 13.15 
>< Maelll >< Tru9I

>< Maell >< Msel 

AGGAAGTCAT AGAGTGTGAC GTGAAAACTA CCGAAGTTGT AGGCAATGTC ATACTTAAAC CATCAGATGA 
6380 6390 6400 6410 6420 6430 6440 

X XhoII 
>< Sau3AI 
>< Nlalll 
X Ndell 
x Mfll 
x Mbol 
x DpnII 

>< BstYI 

. x Tru9I x BspAI 

x Msel X BspHI X Bspl43IX Fnu4HI 

> < Maelll X Mnll X Bbvl X Alwl 

AGGTGTTAAA GTAACACAAG AGTTAGGTCA TGAGGATCTT ATGGCTGCTT ATGTGGAAAA CACAAGCATT 
6450 6460 6470 6480 6490 6500 6510 

x Saul 
X Rmal 

X Mstll 
>< Mael 

>< EcoBll 

x Ode I 

>< Cvnl 

X Bsu36I 

X Bse21I 

x Bfrl> < Tru9I 

x Tru9l x Axyl> < Mselx Muni >< Nlalll 

X Msel >< Alul X Aocl X Dral X Bbvl Fnu4HI X ■ 

ACCATTAAGA AACCTAATGA GCTTTCACTA GCCTTAGGTT TAAAAACAAT TGCCACTCAT GGTATTGCTG 
6520 6530 6540 6550 6560 6570 6580

x Vspl X Styl 

X Tru9I x EcoT14I > < Ddel 

x Msel >< Ecol30I >< BslI 

X Asnl X BssTlI X BsiYI 

x Asel x BsaJI > < Bfrl x Fnu4HI 



x HinPlI 

X Hin6I X Tru9I 

x Hhal x Maelix Msel 

x Ddel >< Dram 

x Bbvl x Cfol x Afllll 



X Rsal > < Rsalx Xbal 

x Csp6I x Csp6I x Rmal 

x Muni x Afal > < Afal x Mael X Alul 

TTGTTCCAAT TGTGTACTTT TACTAAAAGT ACCAATTCTA GAATTAGAGC TTCACTACCT ACAACTATTG
6730 6740 6750 6760 6770 6780 6790 



X Vspl 
X Tru9I 
X Kael 
>< Mspl 

X Msel 



FIGURE 13.16
>< Hpall
>< HapII 
>< CfrlOI X Fokl 
>< Tru9I >< Asnl 

>< Msel >< SfaNI >< Aselx Hphlx Maelll 

CTAAAAATAG TGTTAAGAGT GTTGCTAAAT TATGTTTGGA TGCCGGCATT AATTATGTGA AGTCACCCAA 
6800 6810 6820 6830 6840 6850 6860

X Tru9I X DdeX Maelll > 

X Msel x Bfrl X Bbvl 

ATTTTCTAAA TTGTTCACAA TCGCTATGTG GCTATTGTTG TTAAGTATTT GCTTAGGTTC TCTAATCTGT 
6870 6880 6890 6900 6910 6920 6930

X Sdul 
X NspII 
X HgiAI 

> < Rsal >< Bspl286I 

X Csp6I x Bmyl 

X Fnu4HI > < Afal x Alw21I 

GTAACTGCTG CTTTTGGTGT ACTCTTATCT AATTTTGGTG CTCCTTCTTA TTGTAATGGC GTTAGAGAAT 
6940 6950 6960 6970 6980 6990 7000 

Tru9I X 
Msel x 

x Tru9I > < Maelll >< Fnu4HI 

x Msel >< Maell Bbvl > 

TGTATCTTAA TTCGTCTAAC GTTACTACTA TGGATTTCTG TGAAGGTTCT TTTCCTTGCA GCATTTGTTT 
7010 7020 7030 7040 7050 7060 7070 

> < Tfil RsaI >< 

x "ami X HphI 

> < Hinfl Csp6I X 

>< BsiBI X Xranix Maelll Alul > 

x Pleix Hinfl x BsaBI x Alul x Asp700I Afal x 

AAGTGGATTA GACTCCCTTG ATTCTTATCC AGCTCTTGAA ACCATTCAGG TGACGATTTC ATCGTACAAG 
7080 7090 7100 7110 7120 7130 7140 

X Pall 

X NspBII 
x Kaelll 
x Gdill 

X Fnu4HI 
x Eael 

x Ddel 
x BsuRI 

x Rmal >< BshI x BslI 

>< Mael >< Acilx BsiYI 

CTAGACTTGA CAATTTTAGG TCTGGCCGCT GAGTGGGTTT TGGCATATAT GTTGTTCACA AAATTCTTTT 
7150 7160 7170 7180 7190 7200 7210 

x BspMI >< Rmal 

x Alul x Mael 

ATTTATTAGG TCTTTCAGCT ATAATGCAGG TGTTCTTTGG CTATTTTGCT AGTCATTTCA TCAGCAATTC 
7220 7230 7240 7250 7260 7270 7280 

Rsal x 
X MboII 
x NlalV MamI x 

x Eco64I Csp6I x 

> < Rsal x BscBI BsiBI x 

>< Csp6I >< BanI BsaBI x 

> < Nlalll > < Afalx AccBlI Afal X 

FIGURE 13.17 
TTGGCTCATG TGGTTTATCA TTAGTATTGT ACAAATGGCA CCCGTTTCTG CAATGGTTAG GATGTACATC
7290 7300 7310 7320 7330 7340 7350 

TthHB8I >< 

>< TaqI 
Mnll X 

>< Nctel Ksp632I >< 

>< ECsp632I >< Fokl 

>< Earl >< MboII Earl X 

x Fokl x Eamll04IX Alulx MboII- x Nlalll Earall04I X 

TTCTTTGCTT CTTTCTACTA CATATGGAAG AGCTATGTTC ATATCATGGA TGGTTGCACC TCTTCGACTT 
7360 7370 7380 7390 7400 7410 7420

XhoII X 
Sau3AI X 
Nlalll x 

Ndell X 
Mfll >c 
Mbol x 

x Thai > < Ksp632I 

x Mvnl > < Earl 

x HinPlI x Mlul > < Eamll04I 

>< Hin6I x BstUI DpnII x 

x Hhal x Bsp50I X Rsal BstVI x 

X Nlalll X Cfol X Afllll X Csp6I x Tru91 BspAI X 

>< BspWI x BspWI x AccII x Afal x Msel Bglll x 

GCATGATGTG CTATAAGCGC AATCGTGCCA CACGCGTTGA GTGTACAACT ATTGTTAATG GCATGAAGAG 
7430 7440 7450 7460 7470 7480 7490 

X Pall 
x Haelll 

>< Dsal X MunI 

x MboII x BsuRI Maelll X 

>< °PnI >< Bahl >< Muni BsmAI >< 

x Bspl4 3I x Mnll x BsaJI x Plelx Hinfl Alw261 X 

ATCTTTCTAT GTCTATGCAA ATGGAGGCCG TGGCTTCTGC AAGACTCACA ATTGGAATTG TCTCAATTGT
7500 7510 7520 7530 7540 7550 7560

>< Rsal Tru9I x 

> < Csp6I MseI >< 

>< BsrI X Gsul X MaelllDral X 

>< Afal x Bpral > < BsrI 

GACACATTTT GCACTGGTAG TACATTCATT AGTGATGAAG TTGCTCGTGA TTTGTCACTC CAGTTTAAAA 
7570 7580 7590 7600 7610 7620 7630

X Thai 
x Mvnl . 
> < HphI 
HinPlI x 

X HinPlI 

X Hin6I 
x Hin6I 
Hhal X 

x Hhal 
Cfol x 
x Cfol 
x BstUI 
X BssHII 
Bsp50I X 

> < BsrI >< accII 

GACCAATCAA CCCTACTGAC CAGTCATCGT ATATTGTTGA TAGTGTTGCT GTGAAAAATG GCGCGCTTCA 
7640 7650 7660 7670 7680 7690 7700 

FIGURE 13.18
X Fokl

>< BsmAI 

>< Mnll >< Alw26I >< Acil 

CCTCTACTTT GACAAGGCTG GTCAAAAGAC CTATGAGAGA CATCCGCTCT CCCATTTTGT CAATTTAGAC 
7710 7720 7730 7740 7750 7760 7770 

X Vspl 
X Tru9I 
x Msel 
X Asnl 

>■< Alul x Asel x Bcgl/a 

AATTTGAGAG CTAACAACAC TAAAGGTTCA CTGCCTATTA ATGTCATAGT TTTTGATGGC AAGTCCAAAT 
7780 7790 7800 7810 7820 7830 7840 

x sfci >< Pvuir 

>< Rsal x Psp5l 
>< Plel >< Csp6I x NspBII 

X Hinfl x Ddel x Bcgl X Afal x Alul 

GCGACGAGTC TGCTTCTAAG TCTGCTTCTG TGTACTACAG TCAGCTGATG TGCCAACCTA TTCTGTTGCT 
7850 7860 7870 7880 7890 7900 7910 

TthHB8I X 
TaqI X 
Salt X 
Rtrl X 

x seal Hindu > 

>< Rsal x Tru9I Hindi > 

X Csp6I >< SfaNI x Eco57I 

x Alul >< Maell x Afal x Msel Acel x 

TGACCAAGCT CTTGTATCAG ACGTTGGAGA TAGTACTGAA GTTTCCGTTA AGATGTTTGA TGCTTATGTC 
7920 7930 7940 7950 7960 7970 7980 

x Tru9I 
X Msel 

> < Esp4I >< Sfci 

> < Aflll X BspWI X Alul 
GACACCTTTT CAGCAACTTT TAGTGTTCCT ATGGAAAAAC TTAAGGCACT TGTTGCTACA GCTCACAGCG 

7990 8000 8010 8020 8030 8040 8050 

X PvuII 
X Psp5I 
X NspBII 
X Fnu4HI 

>< Alul >< Bbvl X Alul 

AGTTAGCAAA GGGTGTAGCT TTAGATGGTG TCCTTTCTAC ATTCGTGTCA GCTGCCCGAC AAGGTGTTGT 
8060 8070 8080 8090 8100 8110 8120 

Maelll x 

x Hindu >< BsmAI x Ddel 

x Hindi x Foklx Alw26I x Bfrl 

TGATACCGAT GTTGACACAA AGGATGTTAT TGAATGTCTC AAACTTTCAC ATCACTCTGA CTTAGAAGTG 
8130 8140 8150 8160 8170 8180 8190

X XhoII 
Sau3AI x 

X Ndell 
x Mfll 
x Mbol 
x Nlalll x Hgal 
x Hinll x DpnII 
Dpnl x 
Bspl43I X
>< BsaHI >< BstYI 

>< Maelllx HphI >< Bbill >< BspAI 

>< Maelll >< HphI • >< Nlalll >< AcyI >< BglI£ 

ACAGGTGACA GTTGTAACAA TTTCATGCTC ACCTATAATA AGGTTGAAAA CATGACGCCC AGAGATCTTG
8200 8210 8220 8230 8240 8250 8260 

>< Nspl 

>< NspHI 

>< Nlalll 
X HinPlI 
>< HIn6l 
>< Hhal 

>< Cfol >< BspWI x Maelll 

GCGCATGTAT TGACTGTAAT GCAAGGCATA TCAATGCCCA AGTAGCAAAA AGTCACAATG TTTCACTCAT 
8270 8280 8290 8300 8310 8320 8330 

>< Nspl 

X NspHI x PvuII 

X Nlalll X PspSI 

X Eamll05I X NspBII 
X Bbvl x Fnu4HI 

>< Afllll x Alul >< Bbvl > < Fnu4HI 

CTGGAATGTA AAAGACTACA TGTCTTTATC TGAACAGCTG CGTAAACAAA TTCGTAGTGC TGCCAAGAAG 

8340 8350 8360 8370 8380 8390 8400

X Rmal 

X MboII x Mael x Eamll05I 

AACAACATAC CTTTTAGACT AACTTGTGCT ACAACTAGAC AGGTTGTCAA TGTCATAACT ACTAAAATCT 
8410 8420 8430 8440 8450 8460 8470 

X Tru9I 

X Pall 
x Msel 

x Haelll 
>< Seal x Bsp4I 

x Rsal x Tru9I >< BsuRI 

X Csp6I X Msel x BshI 

x Afal x Dral x Aflll x Bbvl 

CACTCAAGGG TGGTAAGATT GTTAGTACTT GTTTTAAACT TATGCTTAAG GCCACATTAT TGTGCGTTCT
8480 8490 8500 8510 8520 8530 8540 

>< Rsal 
X Csp6I 

x BsrI >< Nlalll 

X Fnu4Ht >< Afal >< Maelll 

TGCTGCATTG GTTTGTTATA TCGTTATGCC AGTACATACA TTGTCAATCC ATGATGGTTA CACAAATGAA 
8550 8560 8570 8580 8590 8600 8610

x Maelll 
> < Maelll 

x Maelll >< Fokl 

ATCATTGGTT ACAAAGCCAT TCAGGATGGT GTCACTCGTG ACATCATTTC TACTGATGAT TGTTTTGCAA 
8620 8630 8640 8650 8660 8670 8680 

Sfcl > 

>< Ns Pl Fnu4HI X 

x NspHI x Nlalll Bbvl X 

x Nlalll x Hgal x BstXI x Bbvl x Alul 

ATAAACATGC TGGTTTTGAC GCATGGTTTA GCCAGCGTGG TGGTTCATAC AAAAATGACA AAAGCTGCCC 
8690 8700 8710 8720 8730 8740 8750 



FIGURE 13.20
>< ScrFI
>< ScrFI >< Rsal 

>< Mval >< Mspl 
>< EcoRII >< Hpall 
>< Ecll36I>< Neil 
>< DsaV >< Hapll 
>< BstOIx DsaV 
X BstNI X Csp6I 

X Fnu4HI >< BsiLI x BcnIDdel x 

x Alul >< Apyl >< Afal 

TGTAGTAGCT GCTATCATTA CAAGAGAGAT TGGTTTCATA GTGCCTGGCT TACCGGGTAC TGTGCTGAGA 
8760 8770 8780 8790 8800 8810 8820 

> < Maelll >< HphI x Mnll x BspWI 

GCAATCAA7G GTGACTTCTT GCATTTTCTA CCTCGTGTTT TTAGTGCTGT TGGCAACATT TGCTACACAC 
8830 8840 8850 8860 8870 8880 8890 

Tru9I > 
SfaNI x 
x Rsal 
Msel > 

x BspWI >< Fnu4HI X Csp6I 

X Bbvlx Mnll X Ddel X Afal 

CTTCCAAACT CATTGAGTAT AGTGATTTTG CTACCTCTGC TTGCGTTCTT GCTGCTGAGT GTACAATTTT 
8900 8910 8920 8930 8940 8950 8960 

>< Mnll 

x Fokl > < Mael 

TAAGGATGCT ATGGGCAAAC CTGTGCCATA TTGTTATGAC ACTAATTTGC TAGAGGGTTC TATTTCTTAT 
8970 8980 8990 9000 9010 9020 9030 

ScrFI > 

Mval >

Mnll X 
EcoRII >< 
Ecll36l > 
DsaV X 
BstOI > 

X NlalV BstNI > 

x Fokl BsiLI > 

X Alul >< BscBI Apyl> 

AGTGAGCTTC GTCCAGACAC TCGTTATGTG CTTATGGATG GTTCCATCAT ACAGTTTCCT AACACITACC 
9040 9050 9060 9070 9080 9090 9100 

x Rsal

x Sfcl >< Nspl 

x Seal >< Nspm 

>< SfaNI >< Rsal >< Nlalll 

> < Maelll >< Csp6I X Nlalll 

x Gsul x Afal X Csp6I 

x Bpml x Ddel >< AccI X Afal 

TGGAGGGTTC TGTTAGAGTA GTAACAACTT TTGATGCTGA GTACTGTAGA CATGGTACAT GCGAAAGGTC 
9110 9120 9130 9140 9150 9160 9170 

x SstI 
X Sdul 
x Sac I 
Nspl I x 
HgiAI X 
EC024I X 
Bspl286I x 



FIGURE 13.21 
Ecll36II ><>< Bmyl
Banll X

>< Tru9I Alw21I ><

>< BsrI >< Msel >< Alul 

AGAAGTAGGT ATTTGCCTAT CTACCAGTGG TAGATGGGTT CTTAATAATG AGCATTACAG AGCTCTATCA 
9180 9190 9200 9210 9220 9230 9240 

>< Tfil 

>< SfaNI >< Hinfl >< Alul >< Mnll 

GGAGTTTTCT GTGGTGTTGA TGCGATGAAT CTCATAGCTA ACATCTTTAC TCCTCTTGTG CAACCTGTGG 
9250 9260 9270 9280 9290 9300 9310



>< Eeo57I > < Bbvl Fnu4HI X 

GTGCTTTAGA TGTGTCTGCT TCAGTAGTGG CTGGTGGTAT TATTGCCATA TTGGTGACTT GTGCTGCCTA 
9320 9330 9340 9350 9360 9370 9380 

>< Rsal 
X Csp6I >< Nlalll 
>< Maell >< Bbvl >< Fnu4HI 

>< AflHI X AfalX HphI >< BspWI 

CTACTTTATG AAATTCAGAC GTGTTTTTGG TGAGTACAAC CATGTTGTTG CTGCTAATGC ACTTTTGTTT 
9390 9400 9410 9420 9430 9440 9450

>< Rsal 
x MalV 
>< Kpnl 

X Eco64I > < ScrFI 

X Csp6I > < Neil 

>< BscBI >< Mspl 

>< Asp718 X Hpall 

x BanI x Alul x Hlnfl 

x Afal x HapII x PleT 

x AccBlI > < Bcnl > < Ddel 

: Acc65l x Alulx DsaV x AccI 



x Rsal 
X Csp6I 

x Afal x HphI x HphI Main x 

ACTTGTACTT GACATTCTAT TTCACCAATG ATGTTTCATT CTTGGCTCAC CTTCAATGGT TTGCCATGTT 
9530 9540 9550 9560 9570 9580 9590 



X TChHB8I 
x Rsal 
x Mnll 
x Mnll 

x Tru9l x Csp6I 

X Tru9I >< Plel x Bcgl/a X TaqI 

X Msel X Ddel X Nlalll X Bbvl 

x Eco57I >< Bfrl x Hinfl x Msel x Maelll x Afal Fnu4HI x 
TTCTTTAACA ACTATCTTAG GAAAAGAGTC ATGTTTAATG GAGTTACATT TAGTACCTTC GAGGAGGCTG 
9670 9680 9690 9700 9710 9720 9730 



FIGURE 13.22 
>< Afal >< Afal >< Alw26I

CTTTGTGTAC CTTTTTGCTC AACAAGGAAA TGTACCTAAA ATTGCGTAGC GAGACACTGT TGCCACTTAC 
9740 9750 9760 9770 9780 9790 9800 

X NlalV 
>< Rsal >< Ddel 

X Csp6I >< BscBI 

X Afal >< Bfrl Alul >< 

ACAGTATAAC AGGTATCTTG CTCTATATAA CAAGTACAAG TATTTCAGTG GAGCCTTAGA TACTACCAGC 
9810 9820 9830 9840 9850 9860 9870

>< Fnu4HI 

>< Ddel 

>< Fnu4HI >< Bfrl 
>< Bbvl X Alul >< Bbvl x Ddel X AlwNI 

TATCGTGAAG CAGCTTGCTG CCACTTAGCA AAGGCTCTAA ATGACTTTAG CAACTCAGGT GCTGATGTTC 
9880 9890 9900 9910 9920 9930 9940 

>< Sfcl >< Bsml 

>< PstI >< BscCI 

TCTACCAACC ACCACAGACA TCAATCACTT CTGCTGTTCT GCAGAGTGGT TTTAGGAAAA TGGCATTCCC 
9950 9960 9970 9980 9990 10000 10010 

>< Rsal 
>< Nlalll 

>< Maelll 

X Csp6X x Tru9I 

>< Afal >< Msel 

GTCAGGCAAA GTTGAAGGGT GCATGGTACA AGTAACCTGT GGAACTACAA CTCTTAATGG ATTGTGGTTG 



10020 10030 10040 10050 10060 10070 10080

XhoII x
Sau3AI x
>< Tru9I NdeII X
X NspI MflI X
>< NspHI MboI x
>< NspI >< NlaIII DpnII x
>< FokI X NspHI >< MseI BstYI X
>< Bst1107I x NlaIII X MboII BspAI X
>< AccI X AflIII > < BbsI BglII X

GATGACACAG TATACTGTCC AAGACATGTC ATTTGCACAG CAGAAGACAT GCTTAATCCT AACTATGAAG
10090 10100 10110 10120 10130 10140 10150

PalI >
MscI >
HaeIII >
EaeI X
BsuRI >
>< DpnI >< MboII BshI >
>< Bsp143I >< AluI BalI >

ATCTGCTCAT TCGCAAATCC AACCATAGCT TTCTTGTTCA GGCTGGCAAT GTTCAACTTC GTGTTATTGG
10160 10170 10180 10190 10200 10210 10220



X Ddel> < Tru9I 

x Bfrl> < Msel X Ddel 

CCATTCTATG CAAAATTGTC TGCTTAGGCT TAAAGTTGAT ACTTCTAACC CTAAGACACC CAAGTATAAA 
10230 10240 10250 10260 10270 10280 10290 

X ScrFI 
x Mval 
X EcoRII 

x Eel 1361 x SphI 

FIGURE 13.23 
>< DsaV x PaeI

>< BstOI >< Nspl 

X BstMI >< NspHI 

>< BsiLI >< fonal >< Nlalll 

>< Apyl >< Mael X HphI 
TTTGTCCGTA TCCMCCTGG TCAAACATTT TCAGTTCTAG CATGCTACAA TGGTTCACCA TCTGGTGTTT 

10300 10310 10320 10330 10340 10350 10360 

>< Sau3AI 
>< Ndell 

x Mboix Nlalll 
>< Eco31I >< DpnII 

X BsmAI >< Tru9IX Dpnl 

x Bsalx Nlalll x Tru9I x Msel x Bspl4 3I 

X Alw26I x Msel X BspAlX Alwl 

ATCAGTGTGC CATGAGACCT AATCATACCA TTAAAGGTTC TTTCCTTAAT GGATCATGTG GTAGTGTTGG 
10370 10380 10390 10400 10410 10420 10430 

X 2sp2I 
>< PpulOI 

x Nsiix sfaNI 
x Ndel 

X Mphll03I Rsal x 

X Tru9I >< EcoT22I Csp6I X 

>< Ms el > < Avalll X Alul Afal X 

TTTTAACATT GATTATGATT GCGTGTCTTT CTGCTATATG CATCATATGG AGCTTCCAAC AGGAGTACAC 
10440 10450 10460 10470 10480 10490 10500 

X SinI 
x Sau96I 
x NspIV 

X NspHII >< sfcl 

X Eco47I R S aI >< 

x Cfrl3I PstI X 

>< BsiZI >< Fnu4HI 

x Rsal x Bmel8I x Hindi! Csp6I x 

X Csp6l>< Ddel x Avail x Hindi >< BspWI 

X AfalX Bf rl X AsuIX Bsgl X Bbvl x BspMI Afal >< 

GCTGGTACTG ACTTAGAAGG TAAATTCTAT GGTCCATTTG TTGACAGACA AACTGCACAG GCTGCAGGTA 
10510 10520 10530 10540 10550 10560 10570

X Tru9I X Nlalll 

X Msel x Bbvl >< Fnu4HI HphI x 

CAGACACAAC CATAACATTA AATGTTTTGG CATGGCTGTA TGCTGCTGTT ATCAATGGTG ATAGGTGGTT 
10580 10590 10600 10610 10620 10630 10640 

X Tru9I
x Tfil 

x Msel >< RsaI 

>< HphI x Tru9I x Csp6I 

x Hinfl x Msel >< Afal 
TCTTAATAGA TTCACCACTA CTTTGAATGA CTTTAACCTT GTGGCAATGA AGTACAACTA TGAACCTTTG 

10650 10660 10670 10680 10690 10700 10710 

>< SinI 
>< Sau96I 
x Pssl 

x Psp5II 
X PpuMI 
X NspIV 

X NspHII 

>< NlalV 



FIGURE 13.24
>< EcoO109I
X Eco47I 
>< Sau3AI >< Drair 

x Ndell x C frl3I 

>< Mbol >< BsiZI 

>< Dpnlix Nlalll >< BscBI 

>< Dpnl x Hindu x Brael8I >< Ddel 

x BspAl x Hindi >< Avail >< 8 frl 

x Bspl43I >< Asul >< Mnll >< Bbvl 

ACACAAGATC ATGTTGACAT ATTGGGACCT CTTTCTGCTC AAACAGGAAT TGCCGTCTTA GATATGTGTG 
10720 10730 10740 10750 10760 10770 10780 

X Styl 
X Rsal 

X EcoT14I 
>< ECO130I' 
X Sfcl > < Csp6I 

X Fnu4HI X Fnu4HI >< BssTlI 

X Bbvl X Fnu4HI >< BsaJI 

X Bbvl >< Alul X PstI X Afal 

CTGCTTTGAA AGAGCTGCTG CAGAATGGTA TGAATGGTCG TACTATCCTT GGTAGCACTA TTTTAGAAGA 
10790 10800 10810 10820 10830 10840 10850 

X Styl 

X EcoTHI 

x Ecol30I 

X BssTlI 

X MboII > < HaelllX BsaJI 

TGAGTTTACA CCATTTGATG TTGTTAGACA ATGCTCTGGT GTTACCTTCC AAGGTAAGTT CAAGAAAATT 
10860 10870 10880 10890 10900 10910 10920 

>< SfaNI 

> < Sdul 

> < NspII >< Tru9I Rsal X 
X Tru9I> < Bspl286I x Msel x Tfil Csp6I X 
x Msel > < Brayl x Fokl x Hinfl Afal x 

GTTAAGGGCA CTCATCATTG GATGCTTTTA ACTTTCTTGA CATCACTATT GATTCTTGTT CAAAGTACAC 

10930 10940 10950 10960 10970 10980 10990

>< XmnI x Muni 

>< BsmI Fnu4HI > 

X BscCI BspWI X 

x Maelll >< Asp700I >< Bbvl Bbvl > 

AGTGGTCACT GTTTTTCTTT GTTTACGAGA ATGCTTTCTT GCCATTTACT CTTGGTATTA TGGCAATTCC 
11000 11010 11020 11030 11040 11050 11060

X Nspl 

x NspHI x Tru9i 

x Nlalll x Msel x Bsral 

X BspWI X Fnu4HIX BspWI X BscCI X Maelll 

TGCATGTGCT ATGCTGCTTG TTAAGCATAA GCACGCATTC TTGTGCTTGT TTCTGTTACC TTCTCXTGCA 
11070 11080 11090 11100 11110 11120 11130 

>< SfaNI 
x Rmal 

> < Nspl x MamI 

> < Nlalll x HphI 

x Nhel >< BspHI 

>< Tru9I x Mael >< BsiBI x Nlalll 

x BspWI x Msel x Accl> < NspHIX Alul X BsaBI x Nlalll 

ACAGTTGCrT ACTTTAATAT GGTCTACATG CCTGCTAGCT GGGTGATGCG TATCATGACA TGGCTTGAAT 
11140 11150 11160 11170 11180 11190 11200

FIGURE 13.25 
>< Tru9I
>< Msel 

> < Rraal > < Esp4I 

> < Mael >< EcoS7I 

>< Alul > < AfJLII >< Alul 

TGGCTGACAC TAGCTTGTCT GGTTATAGGC TTAAGGATTG TGTTATGTAT GCTTCAGCTT TAGTTTTGCT 
11210 11220 11230 11240 11250 11260 11270 

>< Rmal

>< Maell 
>< Hael 

> < Nlalll x SfaNI X Fnu4HI 

>< BspHI X Alul >< Bbvl >< Afllll 

TATTCTCATG ACAGCTCGCA CTGTTTATGA TGATGCTGCT AGACGTGTTT GGACACTGAT GAATGTCATT 
11280 11290 11300 11310 11320 11330 11340

>< Sau96I 
>< Pall
X NspIV 
>< Nlalll 

>< Haelll 
>< Sau3AI > < odel 

>< Ndell. >< Cfrl3I 

>< Mbol >< BsuRI 

>< DpnII x BsiZI 

>< Dpnl X BshI 

X Bspl43I > < Bfrl 

>< ACCI >< BspAIX Alul X Asul 

ACACTTGTTT ACAAAGTCTA CTATGGTAAT GCTTTAGATC AAGCTATTTC CATGTGGGCC TTAGTTATTT 
11350 11360 11370 11380 11390 11400 11410 

>< Rmal 
X Nlalll 

x Mae I>< Sfcl 

X Maelll X Mnll x Maelll >< AlulX Alul 

CTGTAACCTC TAACTATTCT GGTGTCGTTA CGACTATCAT GTTTTTAGCT AGAGCTATAG TGTTTGTGTG 
11420 11430 11440 11450 11460 11470 11480 

Ddel > 

X BsrI >< Nlalll Bfrl > 

TGTTGAGTAT TACCCATTGT TATTTATTAC TGGCAACACC TTACAGTGTA TCATGCTTGT TTATTGTTTC 
11490 11500 11510 11520 11530 11540 11550 

x Pall 
x Haelll 
X Fnu4HI x BsuRI 
X Bbvl X Fnu4HI X BspWI 

x Bbvl x BspWI x BshI x Eco57l x Maelll 

TTAGGCTATT GTTGCTGCTG CTACTTTGGC CTTTTCTGTT TACTCAACCG TTACTTCAGG CTTACTCTTG 
11560 11570 11580 11590 11600 11610 11620

x ScrFI 
X Mval 

X EcoRII 
x EC1136I 

x Dsav 

X BstOI 
X BstNI 

X Eco31I x BsiLI 

x BsmAI > < BsaJI 

x Bsal >< BsaJI 

FIGURE 13.26
>< DrdI >< Alw26I x Apyl Ddel x

GTGTTTATGA CTACTTGGTC TCTACACAAG AATTTAGGTA TATGAACTCC CAGGGGCTTT TGCCTCCTAA 
11630 11640 11650 11660 11670 11680 11690 

X Tru9I 
>< Hsel 

>< SfaNI > < HindIII> < Tru9I 

>< Mnll >< Alul > < Msel > < Mnll > < Nlalll 

GAGTAGTATT GATGCTTTCA AGCTTAACAT TAAGTTGTTG GGTATTGGAG GTAAACCATG TATCAAGGTT 
11700 11710 11720 11730 11740 11750 11760 

>< Vnel 

>< Sdul 
>< NspII 
X HgiAI 
>< Bspl286I 

>< Bmyl X Rsal . . 
x Rsal >< ApaLI x MboII 

X Csp6I x Alw44I x Csp6I Ddel > 

X Afal X Maell X Alw21I X Afal Bfrl > 

GCTACTGTAC AGTCTAAAAT GTCTGACGTA AAGTGCACAT CTGTGGTACT GCTCTCGGTT CTTCAACAAC 
11770 11780 11790 11800 11810 11820 11830

X NspII> < Rsal 

X Drain 
X Sdulx Csp6I 
>< MboII x Bspl286I 

X Hinfl X Plel x Bmyl > < Afal X MboII 

TTAGAGTAGA GTCATCTTCT AAATTGTGGG CACAATGTGT ACAACTCCAC AATGATATTC TTCTTGCAAA 
11840 11850 11860 11870 11880 11890 11900 

X TthHBSI 

X TaqI Sfcl X 

x Hindlll x MboII x Nlalll 

X Alul > < Eco57I X BspWI Ace! X 

AGACACAACT GAAGCTTTCG AGAAGATGGT TTCTCTTTTG TCTGTTTTGC TATCCATGCA GGGTGCTGTA
11910 11920 11930 11940 11950 11960 11970 

X Vspl 

X Tru9I > < Ksp632I 

x Msel x TthHB8I > < Earl 

X Asm X TaqI X MboII > < Eamll04I 

X Aseix Mnll x Bcgl/a x Eco57I >< Eco57I >< Bcgl 

GACATTAATA GGTTGTGCGA GGAAATGCTC GATAACCGTG CTACTCTTCA GGCTATTGCT TCAGAATTTA 
11980 11990 12000 12010 12020 12030 12040 

x StuI 
x ScrFI 

x Pall 
x Mvaix Haelli 
x EcoRIIx EC0147I 

x Ecll36I 
x DsaV x BsuRI 
X BstOI 
x BstNI 

X BspWI 
X BsiLI 

x Fnu4HI x BsaJI X BshI Tfil X 

x Ndel x BspWI x Mall x Bgll x Sfcl Hinfl x 

>< Acil x Apylx AatI > < Alul 



FIGURE 13.27
GTTCTTTACC ATCATATGCC GCTTATGCCA CTGCCCAGGA GGCCTATGAG CAGGCTGTAG CTAATGGTGA
12050 12060 12070 12080 12090 12100 12110

>< XmnI >< Tru9I >< SfaNI 

>< HphI >< Msel >< Ddel 

X Asp700l >< Eco57I >< Bbvl Fnu4HI >< 

TTCTGAAGTC GTTCTCAAAA AGTTAAAGAA ATCTTTGAAT GTGGCTAAAT CTGAGTTTGA CCGTGATGCT 
12120 12130 12140 12150 12160 12170 12180 

XhoII >< 
Sau3AI >< 
Ndell >< 

Mnll > 
>< Mnll 
X Mfll 

> < Sau3AI >< Mbol 

> < Ndell DpnII x 

> < Mbol Dpnl >< 

> < DpnII Ddel x 

X Dpnl BstYI X 

X BspWI X RsalBspAI X 

> < BspAI X Csp6IBspl43I X 
X Nlalll x Bspl43I X AfalBglll X 

GCCATGCAAC GCAAGTTGGA AAAGATGGCA GATCAGGCTA TGACCCAAAT GTACAAACAG GCAAGATCTG 
12190 12200 12210 12220 12230 12240 12250 

X Spel X Ksp632I > < HindHI 

>< Rmal x Ddel x SfaNI 

X Maelll X MboII x Eamll04I X BspWI 

x Mael x BspWI x Earix Bfrl x Alul 

AGGACAAGAG GGCAAAAGTA ACTAGTGCTA TGCAAACAAT GCTCTTCACT ATGCTTAGGA AGCTTGATAA 
12260 12270 12280 12290 12300 12310 12320 

x Thai 

x Mvnl 
x HinPlI 
x Hin6I 

X Hhal 

X Cfol 

X BstOI 

X Tru9I x BspSOI 

>< Msel X AccII Sfcl X 

TGATGCACTT AACAACATTA TCAACAATGC GCGTGATGGT TGTGTTCCAC TCAACATCAT ACCATTGACT 
12330 12340 12350 12360 12370 12380 12390 

>< Rsal 
x NlalV 
x Eco64I 
x Csp6I 
x BslI 

x BsiYIx Kpnl 
x BscBI 
x Banl 
x Asp71B 

>< Nlalll x Afal 

X BstXI x AccBlI >< Maelll 

X Fnu4HI X Bbvl x Acc65I Bsgl X 

ACAGCAGCCA AACTCATGGT TGTTGTCCCT GATTATGGTA CCTACAAGAA CACTTGTGAT GGTAACACCT 
12400 12410 12420 12430 12440 12450 12460 

X Zsp2I 
X PpulOI 
>< NsiI

>< Mphll03I 

>< Ndelx EcoT22I Ddel >< 

X AvallX x SfaNI >< SfaNI >< Acil Bfrl x 

TTACATATGC ATCTGCACTC TGGGAAATCC AGCAAGTTGT TGATGCGGAT AGCAAGATTG TTCAACTTAG 
12470 12480 12490 12500 12510 12520 12530 

X Pall 

>< Haelll X Mnll X DdelDdel X 
x Tru9I>< Nlalll x BsuRI X Maelll X BspWI 

x Mseix HphI > < Xcralx BshI >< Alul BspWI X 

TGAAATTAAC ATGGACAATT CACCAAATTT GGCTTGGCCT CTTATTGTTA CAGCTCTAAG AGCCAACTCA 
12540 12550 12560 12570 12580 12590 12600 

Rsal X 
NlaXV X 
Kpnl X 
X Fnu4HI 
Eco64I x 
CspSI X 

>< Tru9I BscBI >< 

>< PvuH Asp718 X 

X PspSI AfaI ^ 

X NspBII >< Ac ii>< Banr 

>< Msel x Hinfl X Plel AccBlI x 

x Alul > < Sfcl x Ddeix BsrI >< PshAI Acc65I x 

GCTGTTAAAC TACAGAATAA TGAACTGAGT CCAGTAGCAC TACGACAGAT GTCCTGTGCG GCTGGTACCA 
12610 12620 12630 12640 12650 12660 12670 

X TthHBBI 
>< TaqI 
x Sful 
x NspV 
>< Mnll 
X Lspl 
X Csp45I 
X BstBI 

x Rsal x Bs P 119I 

>< Csp6T >< BsiCl 

X Alul X Bpul4I 

x AfaI x AsuII 

CACAAACAGC TTGTACTGAT GACAATGCAC TTGCCTACTA TAACAATTCG AAGGGAGGTA GGTTTGTGCT 
12680 12690 12700 12710 12720 12730 12740

x XhoII 
X Sau3AI 
X Ndell 
X Mfll 
X Mbol 
X DpnII 
x Dpnl 

x BstYI x Tfil x Rsal 

X BspAI x Rraal >< Csp6I 

x Bspl4 3I x Hinfl x Csp6IX Rsal 

x Bglll x Mael x Ddel x AfalX AfaI 

GGCATTACTA TCAGACCACC AAGATCTCAA ATGGGCTAGA TTCCCTAAGA GTGATGGTAC AGGTACAATT 
12750 12760 12770 12780 12790 12800 12810 

X Sau96I

X Pssl 
X Pall 
>< NspIV 
>< Haelll
>< ECOO109I 
>< Drall 
>< Cfrl3I 

>< BsuRI 

Nla * v >< BsiZI Rsal > 

><: BsrI >< BshI Csp6I x 

>< BscBI > < MaelH >< Asul Afal > 

TACACAGAAC TGGAACCACC TTGTAGGTTT GTTACAGACA CACCAAAAGG GCCTAAAGTG AAATACTTGT 
12820 12830 12840 12850 12860 12870 12880 

x Sfol 

> < MboII 
Maell >< 
x Fnu4HI >< Rsal 
>< Eco57I x Csp6I 
x Tr "9I > < Bbsl 

X Msel x Mnll >< Bbvl x Alul X Afal 

ACTTCATCAA AGGCTTAAAC AACCTAAATA GAGGTATGGT GCTGGGCAGT TTAGCTGCTA CAGTACGTCT 
12890 12900 12910 12920 12930 12940 12950 

X Rsal 
X Sfcl >< Csp6I 
>< BspWI x Afal >< BspMI AccI >< 

TCAGGCTGGA AATGCTACAG AAGTACCTGC CAATTCAACT GTGCTTTCCT TCTGTGCTTT TGCAGTAGAC 
12960 12970 12980 12990 13000 13010 13020 

x Mnll 
X Mael x HphI 

CCTGCTAAAG CATATAAGGA TTACCTAGCA AGTGGAGGAC AACCAATCAC CAACTGTGTG AAGATGTTGT 
13030 13040 13050 13060 13070 13080 13090 

x SinI 
X Sau96I 
x NspIV 

X NspHII 
X Nlalll 
X Eco47l 

X Eamll05I 

^ „ , x Cfrl3I. 

X Rsal X Rsar >< B s izr 

X MboII X Csp6I >< Bmel81 >K Xcml 

x Csp€I >< BsrI >< Avail Plel x 

X Afal X Afal x Maelll X Alul X Asul> < Hlnfl 

GTACACACAC TGGTACAGGA CAGGCAATTA CTGTAACACC AGAAGCTAAC ATGGACCAAG AGTCCTTTGG 
13100 13110 13120 13130 13140 13150 13160 

X Tfil 

SfaN! MaeI „ 
x Nlalll x Fokl >< Hinfl 

TGGTGCTTCA TGTTGTCTGT ATTGTAGATG CCACATTGAC CATCCAAATC CTAAAGGATT CTGTGACTTG 
13170 13180 13190 13200 13210 13220 13230 

> < Rsal 
x Maell 

X Csp6I >< pde! 

> < Afal x BsrI x Bfrl 
AAAGGTAAGT ACGTCCAAAT ACCTACCACT TGTGCTAATG ACCCAGTGGG TTTTACACTT AGAAACACAG 

13240 13250 13260 13270 13280 13290 13300 



FIGURE 13.30 



>< Thai 
>< Rsal



>< SfaNI 
>< Mvnl 
>< BstOI 
X BspSOI 

-< <=sp6I >< RciI 

>< Afal >< Acil >< sfcl x Maelll x AccIISfaNI x 

TCTGTACCGT CTGCGGAATG TGGAAAGG7T ATGGCTGTAG TTGTGACCAA CTCCGCGAAC CCTTGATGCA 
13310 13320 13330 13340 13350 13360 13370 

>< Zsp2l 

> < SfaNI 
X Mphll03I>< Tru9I 
x PpulOIX Haell Fnu4HI X 

x Nsil> < Fokl Bsgl x 

x EcoT22I X Msel >< Bbvl 

x Acilx Avalll >< Oral x Acil X Fnu4HI Acil x 

GTCTGCGGAT GCATCAACGT TTTTAAACGG GTTTGCGGTG TAAGTGCAGC CCGTCTTACA CCGTGCGGCA 
13380 13390 13400 13410 13420 13430 13440 

X Spel 

X Seal 

x Rsal 
x Rraal 
x Mael 

> < Csp6I X Sfcl >< BspWI 

X BspWI X Afal X AccI X Bcgl/a Bc gl > 

CAGGCACTAG TACTGATGTC GTCTACAGGG CTTTTGATAT TTACAACGAA AAAGTTGCTG GTTTTGCAAA 
13450 13460 13470 13480 13490 13500 13510 

X ScrFI 
x Mval 

x Mnll 
X EcoRII 
>< Ecll36I 
X BstOI 
>< BstNI 

X BslI 
x DsaV x Bsitt 

>< BsiLI >< plei 

>< Apyl > < Fokl X Hinfl 

GTTCCTAAAA ACTAATTGCT GTCGCTTCCA GGAGAAGGAT GAGGAAGGCA ATTTATTAGA CTCTTACTTT 
13520 13530 13540 13550 13560 13570 13580 

x Nlalll 
X Ksp632I 
X Earl 

X Tru9l >< Eamll04I 

>< Msel >< BsmAI >< Tru9I 

>< Mn ^ x Alw26I x MboII >< Msel 

GTAGTTAAGA GGCATACTAT GTCTAACTAC CAACATGAAG AGACTATTTA TAACTTGGTT AAAGATTGTC 
13590 13600 13610 13620 13630 13640 13650 

x Rsal 
>< NlalV 

> < Nlalll 

x Kpnl 
>< HphI 

> < €co64I 
X Csp6I 

X BscBI 

> < BanI 

> < Asp718 
X Maelll >< Afal
>< NspBll > < AecBll Maell x 

x Acil >< main > < Acc65I > < HgaI 

CAGCGGTTGC TGTCCATGAC TTTTTCAAGT TTAGAGTAGA TGGTGACATG GTACCACATA TATCACGTCA 
13660 13670 13680 13690 13700 13710 13720 

X Mnll 
X Haell 

GCGTCTAACT AAATACACAA TGGCTGATTT AGTCTATGCT CTACGTCATT TTGATGAGGG TAATTGTGAT 
13730 13740 13750 13760 13770 13780 13790 

X Tru9I 

X Msel X Maelll >< Muni 

ACATTAAAAG AAATACTCGT CACATACAAT TGCTGTGATG ATGATTATTT CAATAAGAAG GATTGGTATG 
13800 13810 13820 13830 13840 13850 13860 

x Thai 
x Mvnl 
X Mlul 

X BstOI u==T 
X BspSOI 

X Tfil >< AfllH >< Odel 

>< Hinfl >< Accll >< Bfrl >< A rai Msel >< 

ACTTCGTAGA GAATCCTGAC ATCTTACGCG TATATGCTAA CTTAGGTGAG CGTGTACGCC AATCATTATT
13870 13880 13890 13900 13910 13920 13930 

XhoII > 
Sau3AI > 
Ndell > 
Mfll > 
Mbol > 
Dpnll > 
BstYI > 
BspAI > 



x HphI 
X Csp6I Tru9I X 
x Afal Msel >< 



> < SfaNl 
>< Rsal 
< Csp6I 
x Afar 



x Rsal 
> < Csp6I 
X BspWI 

Afal 



-~ ^ jiUOi >«; Atal BspAI > 

AAAGACTGTA CAATTCTGCG ATGCTATGCG TGATGCAGGC ATTGTAGGCG TACTGACATT AGATAATCAG 
13940 13950 13960 13970 13980 13990 14000



> < ScrFI 

> < Mval 
X Fnu4HI 

X EcoEUI 

> < Ecll36I 

> < Bstor 

> < BstNI 
>< Rsal >< BslI 

X Rsal > < HphI >< BsiYI 

X Csp6I >< Csp6I > < BsiLI 

>< BsrI > < Bbvl > < Apyl 

x Afal >< Afal >< DsaV >< Acil 

GATCTTAATG GGAACTGGTA CGATTTCGGT GATTTCGTAC AAGTAGCACC AGGCTGCGGA GTTCCTATTG 
14010 14020 14030 14040 14050 14060 14070

>< SfaKI 
x Rmal > < Hinfl 

>< Mara! >< Mnll >< Fnu4HIPleI x 

X Tfil >< SfaNI >< BsiBI >< Mael >< Ddel 

X Hinfl X Fokl >< BsaBI >< Bbvl >< BspWI Ndel >< 

TGGATTCATA TTACTCATTG CTGATGCCCA TCCTCACTTT GACTAGGGCA TTGGCTGCTG AGTCCCATAT 
14080 14090 14100 14110 14120 14130 14140 

>< Sau3AI 



X Tru9I 

X Msel 
e Dpnl 
c Bspl43I 

c Aiwr 



FIGURE 13.32 
>< Mbol
>< Mazol 

>< Dpnil Tthllll >< 

>K °PnI Mbon x 

>< Bs P«I X Ksp632I 

X BspAI >< EamllCMI 

>< Bspl43l >< xeral x BsmAI 

>< Bsi BI >< Tru9I x Earl Aspl >< 

>< BsaBI X Fokl XMsel >< Alw26I 

GGATGCTGAT CTCGCAAAAC CACTTATTAA GTGGGATTTG CTGAAATATG ATTTTACGGA AGAGAGACTT 

14150 14160 14170 14180 14190 14200 14210 

> < Sim 

> < Sau96I 

> < NspIV 
>< NspHII 

>< TthHB8l >< NlalV 

>< TaqI >< Fokl 

>< Mcrl > < Eco47I 

> < Ks P 632I > < Cfrl3I 

> < Earl > < Bsi2I 

> < Eamll04I >< SspIX BscBI 
X BsmAI > < Tru9I > < Bmel8I 

• X Mboll >< BsiEI> < Msel > < Avail >< Tru9I 

>< Alw26I x. Dral > < Asul >< Muni x Msel 

TGTCTCTTCG ACCGTTATTT TAAATATTGG GACCAGACAT ACCATCCCAA TTGTATTAAC TGTTTGGATG 
14220 14230 14240 14250 14260 14270 14280 

SinI x 
Sau96l X 
NspIV x 
NspHII > 
Eco47I >< 
Cfrl3I X 
BsiZI X 
Bmel8I x 

x Tru9I Avail x 

>< Fokl >< Msel Asul x 

ATAGGTGTAT CCTTCATTGT GCAAACTTTA ATGTGTTATT TTCTACTGTG TTTCCACCTA CAAGTTTTGG 
14290 14300 14310 14320 14330 14340 14350 

X Spel 
X Rmal 

X Mael >< Sspl >< BsrI 

ACCACTAGTA AGAAAAATAT TTGTAGATGG TGTTCCTTTT GTTGTTTCAA CTGGATACCA TTTTCGTGAG 
14360 14370 14380 14390 14400 14410 14420 

X ThalX Esp3I 

x Ddel 
>< BstUI 

>< Rsal >< Bsp50I x BsmBI 

x Hinfl >< Plel >< Mvnix BsmAI 

> < Csp6I >< Hgalx Alul X Alw26l 

x Afal >< Fokl x AccII > < Bbvl 

TTAGGAGTCG TACATAATCA GGATGTAAAC TTACATAGCT CGCGTCTCAG TTTCAAGGAA CTTTTAGTGT
14430 14440 14450 14460 14470 14480 14490 

X Zsp2I 
X SphI 
>< PpulOI 

>< Pael 
>< Nspl 

FIGURE 13.33 
>< Sau3AI >< NspHI

>< Ndell X Nsil 

>< Mbol >< Nlalll 

X DpnII >< Mphll03I X Nspl 

> < Dpnl >< Fnu4HI NspHI >< 
>'< Fnu4HIX BspWI >< EcoT22I Nlalll >< 

>< BspAI >< BspWI >< BspWI 

> < Bspl43I> < Avalll > < AlwNI >< Rraal - >< Bsgl 
>< Alwl >< Alul >< Alul >< Bbvl >< Mael >< Bbvl 

ATGCTGCTGA TCCAGCTATG CATGCAGCTT CTGGCAATTT ATTGCTAGAT AAACGCACTA CATGCTTTTC 

14500 14510 14520 14530 14540 14550 14560

x ScrFI 
>< Neil 
X Mspl 
X Hpall 

X Fnu4HI >< HapII 

x AlwNI >< DsaV >< Tru9I 

x Alul x Bcnl x Msel 

AGTAGCTGCA CTAACAAACA ATGTTGCTTT TCAAACTGTC RAACCCGGTA ATTTTAATAA AGACTTTTAT 
14570 14580 14590 14600 14610 14620 14630 

X Tru9I Ddel X 

>< Msel >< MboII Bbvl X 

GACTTTGCTG TGTCTAAAGG TTTCTTTAAG GAAGGAAGTT CTGTTGAACT AAAACACTTC TTCTTTGCTC

14640 14650 14660 14670 14680 14690 14700

X Fokl EcoRV X 

X Fnu4HI Eco32I X 

AGGATGGCAA CGCTGCTATC AGTGATTATG ACTATTATCG TTATAATCTG CCAACAATGT GTGATATCAG 
14710 14720 14730 14740 14750 14760 14770 

X Vspl 
>< Tru9I 
X Msel 

x Asm 

>< Maelll X Asel 

ACAACTCCTA TTCGTAGTTG AAGTTGTTGA TAAATACTTT GATTGTTACG ATGGTGGCTG 7ATTAATGCC 
14780 14790 14800 14810 14820 14830 14840

X Tru9I 

X Msel x PvuII 

x Hpal x Psp5I > < Xcml . 

X Hindu X NspBII X Tru9I Rmal X 

x Hindi x Alul x Msel Mael X 
AACCAAGTAA TCGTTAACAA TCTGGATAAA TCAGCTGGTT TCCCATTTAA TAAATGGGGT AAGGCTAGAC 

14850 14860 14870 14880 14890 14900 14910 

>< SfaNI x Thai 

x Sau3AI >< Mvnl 

x Ndell X BstOI 

X Mbol >< Bstll07I 

X DpnII >< BspWI X Fokl 

>< Dpnl >< Bsp50I 

X Plel >< Bspl43I x Acclix Ddel 

X Hinf IX Mnll x BspAI x Alwl x AccI 

TTTATTATGA CTCAATGAGT TATGAGGATC AAGATGCACT TTTCGCGTAT ACTAAGCGTA ATGTCATCCC 

14920 14930 14940 14950 14960 14970 14980 

X SstI 
X Sdul 
X Sad 

FIGURE 13.34 
>< NspII
>< HgiAI 
>< Eco24I 

>< Tru9I > < Ecll36II 

>< Tfil X Bspl286I 

>< Msel >< Brayl 

>< Hinfl >< Banll 

> < Esp4I >< Alw21I 

> < Aflll >< BspWI > < Alul >< Alul 
TACTATAACT CAAATGAATC TTAAGTATGC CATTAGTGCA AAGAATAGAG CTCGCACCGT AGCTGGTGTC 

14990 15000 15010 15020 15030 15040 15050 

>< Seal > < Mn ii 

>< SfcIX Rsal Hael X 

>< BsmAI X Csp6I X Fnu4HI 

X Alw26I X Afal x Acil 

TCTATCTGTA GTACTATGAC AAATAGACAG TTTCATCAGA AATTATTGAA GTCAATAGCC GCCACTAGAG 

15060 15070 15080 15090 15100 15110 15120

>< Tru9I 

>< Alul >< Ms el 

GAGCTACTGT GGTAATTGGA ACAAGCAAGT TTTACGGTGG CTGGCATAAT ATGTTAAAAA CTGTTTACAG 
15130 15140 15150 15160 15170 15180 15190 

Nspl X 
NspHI X 
Nlalll X 
>< Nlalll 

.Ddel x 
BspWI x 
x Maelll Bfrl x 

TGATGTAGAA ACTCCACACC TTATGGGTTG GGATTATCCA AAATGTGACA GAGCCATGCC TAACATGCTT 
15200 15210 15220 15230 15240 15250 15260 

> < Pall 

> < Haelll 

> < BsuRI 

> < Bshl x Mnll x Maelll Sfcl X 
AGGATAATGG CCTCTCTTGT TCTTGCTCGC AAACATAACA CTTGCTGTAA CTTATCACAC CGTTTCTACA 

15270 15280 15290 15300 15310 15320 15330 

Tru9I X 

ScrFI > 
Mval > 
X Msel 

x MstI pokl X 

X HinPlI EcoRII X 

X Hin61 Edl36I > 

> < Hhal DsaV x 
>< fspl BstOI > 
X Fdill x Nlalll BstNI > 

> < CfoIX Tru9I > < Fnu4HI BsiLI > 
X Alul X Avill x Msel x Acil Apyl > 

GGTTAGC7AA CGAGTGTGCG CAAGTATTAA GTGAGATGGT CATGTGTGGC GGCTCACTAT ATGTTAAACC 
15340 15350 15360 15370 15380 15390 15400 

> < SfaNI 

X Mspl 

X Hpall x HphI 

x HapII x BspWI 



FIGURE 13.35 
AGGTGGAACA TCATCCGGTG ATGCTACAAC TGCTTATGCT AATAGTGTCT TTAACATTTG TCAAGCTGTT
15410 15420 15430 15440 15450 15460 15470

p< Bs P WI >< Alul Dr > I <AciI 

ACAGCCAATG TAAATGCACT TCTTTCAACT GATGGTAATA AGATAGCTGA CAAGTATGTC CGCAATCTAC 
15480 15490 15500 15510 15520 15530 15540 

>< Sau3AI 
>< Ndell 
>< Mbol 
> < HamI 

>< Fbal 
>< DpnII 

>< Dpnl 
>< BspHI 
>< BspAI 

>< Bspl43I 
>< BsiQI 

>< Sfcl > < BsiBIX Nlalll 

X BsmAI > < BsaBIX Fokl 

X Alw26I >< Bclrx EcoRI Fokl x 

AACACAGGCT CTATGAGTGT CTCTATAGAA ATAGGGATGT TGATCATGAA TTCGTGGATG AGTTTTACGC 
15550 15560 15570 15580 15590 15600 15610 

X Tfil 

>< SfaNI 
x Nlalll 

X BspMI >< Hinfl >< Maelir 

TTACCTGCGT AAACATTTCT CCATGATGAT TCTTTCTGAT GATGCCGTTG TGTGCTATAA CAGTAACTAT 
15620 15630 15640 15650 15660 15670 15680

> < Rraal 
x Nhel x Tru9I 
X Fnu4HI > < Mael >< Tru91 

>K A =il X Alul x Msel x Msel »nij >< 

GCGGCTCAAG GTTTAGTAGC TAGCATTAAG AACTTTAAGG CAGTTCTTTA TTATCAAAAT AATGTGTTCA 
15690 15700 15710 15720 15730 15740 15750

x SinI 
x Sau96I 
x PssI 
X PspSII 
>< PpuMI 
X NspIV 

X NspHII 
>< Eco0109I 
x Eco47I 
X Drall 
x Cfrl3I 
X BsiZI 
x Ddel >< Bmel8I 

x Nlalll >< BsmAI >< Avail 

x Ddel X Alw26I >< Asul >< Mali 

TGTCTGAGGC AAAATGTTGG ACTGAGACTG ACCTTACTAA AGGACCTCAC GAATTTTGCT CACAGCATAC 
15760 15770 15780 15790 15800 15810 15820 

>< XhoII 
>< Sau3AI 
x Ndell 
x Mfll 
x Mbol 

FIGURE 13.36
>< Rsal x DpnII

x Maell >< Dpnl > < Sspl 

X Tru9I >< Csp6I >< BstYI HinPlI >< 

>< Rmal >< BsaAI >< BspMI Hin6I X 

>< Mael >< Afllll >< BspAI Hhal X 

X BspWIX Msel x Afal x AlwIX Bspl43I Cfol x 

AATGCTAGTT AAACAAGGAG ATGATTACGT GTACCTGCCT TACCCAGATC CATCAAGAAT ATTAGGCGCA 
15830 15840 15850 15860 15870 15880 15890 

x Rsal >< sfaHI 

x TthHBSI x Csp6I >< Maelli 

X TaqI • x Afal BsrI x 

GGCTGTTTTG TCGATGATAT TGTCAAAACA GATGGTACAC TTATGATTGA AAGGTTCGTG TCACTGGCTA 
15900 15910 15920 15930 15940 15950 15960 

> < Fokl . 
x BspWI 

TTGATGCTTA CCCACTTACA AAACATCCTA ATCAGGAGTA TGCTGATGTC TTTCACTTGT ATTTACAATA 
15970 15980 15990 16000 16010 16020 16030

X Van91I 
>< PflMI 
x Nspl 

> < Pal IX NspHI 

> < Msclx Nlalll 

> < Haelll 

> < BsuRI 
x BsrI 

X Eael >< BslI x Nspl 

> < BshlX BsiYI X NspHI 

x main x Afini x Afiin 

x Maelli x Alul > < BallX AccB7I x Nlalll 

CATTAGAAAG TTACATGATG AGCTTACTGG CCACATGTTG GACATGTATT CCGTAATGCT AACTAATGAT 
16040 16050 16060 16070 16080 16090 16100

X Rsal> < NlalV 
X Mnll 

X Csp6I x Ddel x Rsal 

x BsrI x Mnll x csp6I 

X Afal> < BscBI X Afal Sfcl X 

AACACCTCAC GGTACTGGGA ACCTGAGTTT TATGAGGCTA TGTACACACC ACATACAGTC TTGCAGGCTG 

16110 16120 16130 16140 16150 16160 16170 

x NlalV 

X EcoNI 
X Eco31I 
X Eco64I>< BsraAI 

>< BscBI X BslI 
x Ban I x BsiYI 
x Acil x Bsal 

>< B SPWI >< AccBlIX Alw26I Bbvl X 

TAGGTGCTTG TGTATTGTGC AATTCACAGA CTTCACTTCG TTGCGGTGCC TGTATTAGGA GACCATTCCT 
16180 16190 16200 16210 16220 16230 16240 

X TthlUI 

X Fnu4HI x Nlalll > < Tru9I 

X BspWI X Aspl > < Msel 

ATGTTGCAAG TGCTGCTATG ACCATGTCAT TTCAACATCA CACAAATTAG TGTTGTCTGT TAATCCCTAT 

16250 16260 16270 16280 16290 16300 16310 



FIGURE 13.37 
>< EcoRII

X Eel 1361 
>< DsaV 

>< BstOI 
>< BstNI 

>< BsiLI >< Rmai 

>< B saJl >< Mnll BspWI x 

>< Apyl >< Maelll >< Maelll X Mael >< A lul 

GTTTGCAATG CCCCAGGTTG TGATGTCACT GATGTGACAC AACTGTATCT AGGAGGTATG AGCTATTATT 

16320 16330 16340 16350 16360 16370 16380 

X Maelll ■ >< Hnll 

GCAAGTCACA TAAGCCTCCC ATTAGTTTTC CATTATGTGC TAATGGTCAG GTTTTTGGTT TATACAAAAA 
16390 16400 16410 16420 16430 16440 16450 

>< Nspl >< NspI 

X NspHI > < Tthllll >< NspHI 

X Nlalllx Maeinx Maelll >< Nlalll 

X Afllll x Aspl x Afllll 

CACATGTGTA GGCAGTGACA ATGTCACTGA CTTCAATGCG ATAGCAACAT GTGATTGGAC TAATGCTGGC 
16460 16470 16480 16490 16500 16510 16520 

x Rsal 
x Plel 
x Ddel 
x Csp6I 

X BsmAI x Hinfl x Mnll 

x Alw26I . x Hindi II Ddel x 

X Afal x Alul x Fnu4HI X Bbvl 

GATTACATAC TTGCCAACAC TTGTACTGAG AGACTCAAGC TTTTCGCAGC AGAAACGCTC AAAGCCACTG 
16530 16540 16550 16560 16570 16580 16590

> < Thai 

>< Seal 
>< Rsal x Rsal 

> < Mvnl 

X Csp6I X Csp6I . 

> < BstOI 

> < Tru9I > < Bsp50I 

> < Msel > < Ndel x Afal x Afal 

Alul > < AccII Mnll > 

AGGAAACATT TAAGCTGTCA TATGGTATTG CCACTGTACG CGAAGTACTC TCTGACAGAG AATTGCATCT 
16600 16610 16620 16630 16640 16650 16660

Maelll X 
>< Maelll 
x Eco0651 
X Eco91I 
x BstPI 

x SfaNI x Rmai >< BstEII 

x Nlalll x Mael >< BsrI 

TTCATGGGAG GTTGGAAAAC CTAGACCACC ATTGAACAGA AACTATGTCT TTACTGGTTA CCGTGTAACT 
16670 16680 16690 16700 16710 16720 16730 

Rsal x 
X Mnll 

x Rsal x Rsal >< HphI 

X Csp6I x Csp6I x SfaNI Csp6I X 

x Afal x Afal x Maelll x. HphI Afal x 

AAAAATAGTA AAGTACAGAT TGGAGAGTAC ACCTTTGAAA AAGGTGACTA TGGTGATGCT GTTCTGTACA 
16740 16750 16760 16770 16780 16790 16800 
>< Rsal >< HphI

>< Csp6I >< Hindll Ddel x 

>< Afal >< Hindi Bfrl >< 

GAGGTACTAC GACATACAAG TTGAATGTTG GTGATTACTT TGTGTTGACA TCTCACACTG TAATGCCACT 

16810 16820 16830 16840 16850 16860 16870 

>< Vnel 
>< Snol 

>< Sdul 

>< NspII 

>< HgiAI > < sdul 

X Drain. > < NspII 

>< Bspl286I > < HgiAI 

>< Brayl >< BspHI >< Oralll >< Rsal 

X ApaLI >< Rraal > < Bspl286I x Csp6I 

x Alw44I x Mael > < Bmyl >< BsrI 

X Alw2ll > < Alw21I >< Afal Ddel > 
TAGTGCACCT ACTCTAGTGC CACAAGAGCA CTATGTGAGA ATTACTGGCT TGTACCCAAC ACTCAACATC 

16880 16890 16900 16910 16920 16930 16940 

Styl X 
Sitil > 
Sau96I > 
NspIV > 
ECOT14I X 

EC047I > 
ECO130I X 
X Seal Cfrl3I > 
BssTlI X 

x Sphi >< Rsal BsiZI > 
x Pael BsaJI x 

>< Nlalll Bmel8I > 

>< Rmal x Nspix Csp6I Avail > 

>< wael >< NspHIX Afal Asul > 

TCAGATGAGT TTTCTAGCAA TGTTGCAAAT TATCAAAAGG TCGGCATGCA AAAGTACTCT ACACTCCAAG 
16950 16960 16970 16980 16990 17000 17010 



>< EcoRII 
X Eel 1361 

> < Csp6l 
X BstOI 
X BstNI 
x Xcml x BslI 
X NspHII x BsiYI 
X BsiLI 

X Apyl >< BsrI 
x DsaVX Afal > < Hinfix Plel 
GACCACCTGG TACTGGTAAG AGTCATTTTG CCATCGGACT TGCTCTCTAT TACCCATCTG CTCGCATAGT 
17020 17030 17040 17050 17060 17070 17080

X SfaNI 
X SphI x PvuII 

>< Pael >< Psp5I 

x Nspl x NspBII 

x NspHI x Fnu4HI > < Tru9I 

x Bstll07I > < Nlalllx BspWI x Sspl 

x Accl x Mlalli x Alul x Bbvi > < Msel 

GTATACGGCA TGCTCTCATG CAGCTGTTGA TGCCCTATGT GAAAAGGCAT TAAAATATTT GCCCATAGAT 
17090 17100 17110 17120 17130 17140 17150 



FIGURE 13.39 
> < Thai
>< Thai 

> < Mvnl 
x Mvnl X Thai 

> < HinPlI 
>< HinPlI 

>< HinPlI >< Mvnl 

> < Hin6I 
>< Hin6I 

> < Hhal 
>< Hhal >< Hhal 

> < Cfol 
>< Cfol X Cfol 

> < BstOI 
X BstUI X BstOI 

x BssHII 
>< BspMI 

> < Bsp50I 

x Bsp50I>< BspSOI taI > 

X Tfil >< Hin6I> < AccII MaeI > 

X Hinfl >< AccII >< AccII > < EcoRI 

AAATGTAGTA GAATCATACC TGCGCGTGCG CGCGTAGAGT GTTTTGATAA ATTCAAAGTG AATTCAACAC 
17160 17170 17180 17190 17200 17210 17220 

X 2sp2I 
X PpulOI 

X Nsil 

X Mphll03I 

X ECOT22I 

>c B ^ X > < Avalll >< 0rdI 

TAGAACAGTA TGTTTTCTGC ACTGTAAATG CATTGCCAGA AACAACTGCT GACATTGTAG TCTTTGATGA 
17230 17240 17250 17260 17270 17280 17290 

X Rmal 

AATCTCTATG GCTACTAATT ATGACTTGAG TGTTGTCAAT GCTAGACTTC GTGCAAAACA CTACGTCTAT 
17300 17310 17320 17330 17340 17350 17360 

X Sau3AI 
X Ndell 
>< Mbol 
X OpnII 
X Dpnl 

><Z Bs P AI x Rmal 

X AlwIX 8 S pl43I > < Acij >c Mael s >K 

ATTGGCGATC CTGCTCAATT ACCAGCCCCC CGCACATTGC TGACTAAAGG CACACTAGAA CCAGAATATT 

17370 17380 17390 17400 17410 17420 17430 

x SinI 
x Sau96I 

>< NspIV >< styl 

X NspHII X Nspl 

X Eco47I >< NspHI 

X Cfrl3l x Nlalll 

>< BsiZI x EcoPHI 

x Bsgl x ECO130I 

^ x Bmel8I x BssTlI 

><c Tru91 X Avail >< BsaJI 

>K MseI x Asul> < Afllll 

TTAATTCAGT GTGCAGACTT ATGAAAACAA TAGGTCCAGA CATGTTCCTT GGAACTTGTC GCCGTTGTCC 

17440 17450 17460 17470 17480 17490 17500 

FIGURE 13.40 
>< Hindi! 

>< Hindi >< Alul 

TGCTGAAATT GTTGACACTG TGAGTGCTTT AGTTTATGAC AATAAGCTAA AAGCACACAA GGATAAGTCA 
17510 17520 17530 17540 17550 17560 17570 

>< A1 «i >< main 

GCTCAATGCT TCAAAATGTT CTACAAAGGT GTTATTACAC ATGATGTTTC ATCTGCAATC AACAGACCTC 
17580 17590 17600 17610 17620 17630 17640 



C HphI 



>< Mnll 
>< EcoNI 

>< ««« x 
>K BsiYI >< Alul 

AAATAGGCGT TGTAAGAGAA TTTCTTACAC GCAATCCTGC TTGGAGAAAA GCTGTTTTTA TCTCACCTTA 
17650 17660 17670 17680 17690 17700 17710 

>< Sfcl >< Ddel >< Tfil 

> < Alul >< Bfrl >< Hinfl 

TAATTCACAG AACGCTGTAG CTTCAAAAAT CTTAGGATTG CCTACGCAGA CTGTTGATTC ATCACAGGGT 
17720 17730 17740 17750 17760 17770 17780 

> < Hindll 

>< nhim > < HincII 

>< Aspl !.< Aci j 

TCTGAATATG ACTATGTCAT ATTCACACAA ACTACTGAAA CAGCACACTC TTGTAATGTC AACCGCTTCA 
17790 17800 17810 17820 17830 17840 17850 

>< XhoII 
>< Sau3AI 
>< Ndell 
x Mfll 
X Mbol 
><- Maral 
X DpnII 

X BstYI 
X BspAI • 
X Bspl43I 
x BsiBI 
X BsaBI 
X BspWI >< Bglll 

ATGTGGCTAT CACAAGGGCA AAAATTGGCA TTTTGTGCAT AATGTCTGAT AGAGATCTTT ATGACAAACT 
17860 17870 17880 17890 17900 17910 17920 

>< Xbal 

X Rmal .,«. MaeIII 

x Mael x Maell BsrI >< 

GCAATTTACA AGTCTAGAAA TACCACGTCG CAATGTGGCT ACATTACAAG CAGAAAATGT AACTGGACTT 
17930 17940 17950 17960 17970 17980 17990 

X Sau3AI 
X Ndell 

X MboII 
X Mbol 

> < Fokl 

>< DpnII >< NlalV 

x Dpnl >< Eco64I 

X BspAI >< BscBI 

>< Tru « >< Bspl43I >< Banl Mnll X 

X Mselx Sfcl x Bbsl > < BsrI x AccBlI x Odd 
TTTAAGGACT GTAGTAAGAT CATTACTGGT CTTCATCCTA CACAGGCACC TACACACCTC AGCGTTGATA 
18000 18010 18020 18030 18040 18050 18060 

>< ScrFI 
X Mval 
X EcoRII 
X Eco57I 

>< EC1136I 
>< DsaV 

>< BstOI >< piel 

X BstNI >< Nlalll 

X HindllX BsiLI Hinfl X 

X HincIIX Apyl A ccl X 

TAAAGTTCAA GACTGAAGGA TTATGTGTTG ACATACCAGG CATACCAAAG GACATGACCT ACCGTAGACT 
18070 18080 18090 18100 18110 18120 18130 

X Maelll Thai X 

X Eco0651 Mvnl X 

X Eco91I BstUI X 

x BstXI Bsp50I X 

X BstPI >< Aeil 

X BstEII >< HphI AccII >< 
CATCTCTATG ATGGGTTTCA AAATGAATTA CCAAGTCAAT GGTTACCCTA ATATGTTTAT CACCCGCGAA 

18140 18150 18160 18170 18180 18190 18200 

X XranI 

> < MboII ><; sfaNI 

> < MaeI11 X Rmal 
X AS P 700I >K NlaIII 

x Alul x Maell >< Mnll >< M ael 

GAAGCTATTC GTCACGTTCG TGCGTGGATT GGCTTTGATG TAGAGGGCTG TCATGCAACT AGAGATGCTG 
18210 18220 18230 18240 18250 18260 18270 

X Tru9I 
X Msel 

>< R sal X Hpal 

>< Gsul >< RmaT >< Hindll >< Rsal 

>< Csp6I >< Mnll >< Hindi x Csp6I 

><: B P raI >< Mael >< Ddel X Alul BsrI X 

x Afal x Alul x Sfcl x Bfrl X Afar 

TGGGTACTAA CCTACCTCTC CAGCTAGGAT TTTCTACAGG TGTTAACTTA GTAGCTGTAC CGACTGGTTA 
18280 18290 18300 18310 18320 18330 18340 

X ScrFI 
X Mval 
X Mnll 
>< Maelll 
>< EcoRII 

X Eco0651 
X EcoNI 

X Eco91I 
x EC1136I 
x DsaV Tru9l x 
X Dralll 
X BstPI 
X BstOI 

X BstNI Prael X 
x BstEII 
>< Ball Msel x 

, . J >< BsiYI HphI x 

X Hindu x HphI x Tru9I >< BsiLI Oral x 

X Hindi x EcoRI X Msel >< Apyl X BsrI 
TGTTGACACT GAAAATAACA CAGAATTCAC CAGAGTTAAT GCAAAACCTC CACCAGGTGA CCAGTTTAAA 
18350 18360 18370 18380 18390 18400 18410 

>< ScrFI 

>< Mval 
>< EcoRII 

>< Ecll36I 
>< DsaV 

>< BstOI 

>< BstNI >< Rsal 

X BsiLI DdelX 
>< BsaJI > < Tru9I>< Csp6I 

>< Nlalll x Apyi > < Msel >< Afal 

CATCTTATAC CACTCATGTA TAAAGGCTTG CCCTGGAATG TAGTGCGTAT TAAGATAGTA CAAATGCTCA 
18420 18430 18440 18450 18460 18470 18480 

>< Nlalll 
>< HinPlI 
X Tthllll >< Hin6I 

>< Hinfl > < Hhal 

>< Aspl x Plel > < cfol >< Alul 

GTGATACACT GAAAGGATTG TCAGACAGAG TCGTGTTCGT CCTTTGGGCG CATGGCTTTG AGCTTACATC 
18490 18500 18510 18520 18530 18540 18550 

X SinI 
x Sau96I 
x NspIV 

X NspHII 
X Eco47I 
X Cfrl3I 
X Seal >< BsiZI 

X Raal x Bmel8I 

>< Csp6I x Avail x Maell 

x Afal >< Asul x Afllll >< MaelllX Maell 

AATGAAGTAC TTTGTCAAGA TTGGACCTGA AAGAACGTGT TGTCTGTGTG ACAAACGTGC AACTTGCTTT 
18560 18570 18580 18590 18600 18610 18620 

> < Tfil >< Tthllll 

> < Hinfl > < Aspl 

TCTACTTCAT CAGATACTTA TGCCTGCTGG AATCATTCTG TGGGTTTTGA CTATGTCTAT AACCCATTTA 
18630 18640 18650 18660 18670 18680 18690 

>< ScrFI 
Rsal X 
X Mval 
X EcoRII 
EC1136I x 

x DsaV 
Csp6I X 
BstXI X 

> < Maelll >< BstOI 

> < Ecc-0651 >< BstNI 

> < Eco91I >< BsiLI 

> < Bat PI >< Apyi 
x Eco57I> < BstEII X Maelll x Nlalll Afal >< 

TGATTGATGT TCAGCAGTGG GGCTTTACGG GTAACCTTCA GAGTAACCAT GACCAACATT GCCAGGTACA 
18700 18710 18720 18730 18740 18750 18760 



FIGURE 13.43 
>< Nlalll x Rmal 

>< Mael X Nlalll Tru9I >< 

x Nlalll >< Bspwi >< Mael >< Nlalll * 

> < Afllll >< BspHI Msel >< 

TGGAAATGCA CATGTGGCTA GTTGTGATGC TATCATGACT AGATGTTTAG CAGTCCATGA GTGCTTTGTT 
18770 18780 18790 18800 18810 18820 18830 

>< Thai 
>< Mvnl 
. X HinPlI 
>< flin6I 
X Hhal 
X Cfol 

X BstUI >< EcoNI> < Mnll 

X Bsp50I >< BslI >< Tru9I 

x AccII >< BsiYI x Ddel x Msel 

AAGCGCGTTG ATTGGTCTGT TGAATACCCT ATTATAGGAG ATGAACTGAG GGTTAATTCT GCTTGCAGAA 
18840 18850 18860 18870 18880 18890 18900 

x Rsal 

x Csp6I >< Mboll > < Nlalll 

X Afal X Nlalll >< BspWI x Bsrl x BspHI 

AAGTACAACA CATGGTTGTG AAGTCTGCAT TGCTTGCTGA TAAGTTTCCA GTTCTTCATG ACATTGGAAA 

18910 18920 18930 18940 18950 18960 18970 

x Saul 
>< Mstll 
x Eco81I 

>< Ddel Nlalll X 

x Cvnl X Espl 

>< Bsu36I >< Eco57I Maelll X 

x Bse2H X Ddel 

X Axyr x Cell! 

>< AocI X Mnll x SfaNI X Bpull02I 

TCCAAAGGCT ATCAAGTGTG TGCCTCAGGC TGAAGTAGAA TGGAAGTTCT ACGATGCTCA GCCATGTAGT 

18980 18990 19000 19010 19020 19030 19040 

>< Mnll x Ksp632I 

>< Hindlll x Earl 

x Alul x Mboll x Eamll04I 

GACAAAGCTT ACAAAATAGA GGAACTCTTC TATTCTTATG CTACACATCA CGATAAATTC ACTGATGGTG 
19050 19060 19070 19080 19090 19100 19110 

x Sau3AI 
x Ndell 
X Mbol 
x MaeII> < Maelll 
X DpnII 
X Dpnl 

X BspAI Hinfl > 

x Maelll x Bspl43l x Muni DrdI x 

TTTGTTTGTT TTGGAATTGT AACGTTGATC GTTACCCAGC CAATGCAATT GTGTGTAGGT TTGACACAAG 
19120 19130 19140 19150 19160 19170 19180 

Zsp2I X 
>< SphI 
> < PpulOI 
x Pael 
>< Nspl 

x ScrFI X NspHI 

x Mval >< Nlalll 

x EcoRII Mphll03I x 
>< EC1136I >< Gsul 

X DsaV EcoT22I X 

>< BSCOI >< Bsml 

>< BstNI x BscCI 

>< BsiLI >< Bpml >< Nsil 

>< PI el x Apyl >< Aval I I 
AGTCTTGTCA AACTTGAACT TACCAGGCTG TGATGGTGGT AGTTTGTATG TGAATAAGCA TGCATTCCAC 

19190 19200 19210 19220 19230 19240 19250 

>< Tru9I 

> < Muni 

>< TthHB8i >< Msel 

X Bcgl/a >< TaqI >< Dral 

>< Alul >< Bcgl 

ACTCCAGCTT TCGATAAAAG TGCATTTACT AATTTAAAGC AATTGCCTTT CTTTTACTAT TCTGATAGTC 
19260 19270 19280 19290 19300 19310 19320 



>< Hinfix Alw26I Afllll >< 

CTTGTGAGTC TCATGGCAAA CAAGTAGTGT CGGATATTGA TTATGTTCCA CTCAAATCTG CTACGTGTAT 
19330 19340 19350 19360 19370 19380 19390 

Zsp2l > 
X Seal 

PpulOI >< 
>< RsalNsil > 
Mphll03I > 
>< SfaNIEcoT22I > 
> < Rsal >< Csp6I 
>< Csp6I Avail! >< 

>< NlaIII> < Afal x Afar 

TACACGATGC AATTTAGGTG GTGCTGTTTG CAGACACCAT GCAAATGAGT ACCGACAGTA CTTGGATGCA 
19400 19410 19420 19430 19440 19450 19460 

X Fokl 

TATAATATGA TGATTTCTGC TGGATTTAGC CTATGGATTT ACAAACAATT TGATACTTAT AACCTGTGGA 
19470 19480 19490 19500 19510 19520 19530 

>< ScrFI 

>< Hval 
X Maelll 
>< EcoRIX 

>< EC1136I 
>< DsaV 

X BstOI 

X BstNI 

x BsiLI x Tru9I 

>< Apyl >< Msel 

ATACATTTAC CAGGTTACAG AGTTTAGAAA ATGTGGCTTA TAATGTTGTT AATAAAGGAC ACTTTGATGG 
19540 19550 19560 19570 19580 19590 19600 

X SgrAI 
>< Nael 

x Mspi > < Vspl . 

X Hpall > < Tru9I 

X Hapll > < Msel 

X CfrlOI > < Asnl 

X BspWI > < AscI 
ACACGCCGGC GAAGCACCTG TTTCCATCAT TAATAATGCT GTTTACACAA AGGTAGATGG TATTGATGTG 

19610 19620 19630 19640 19650 19660 19670 

FIGURE 13.45 
>< XhoII 
X Sau3AI 
>< Ndell 
>< Mfll 
>< Mbol 
>< Dpnll 

><: DpnI >< Maelll 

>< BstYI >K EspI 

X BspAI ><: DdeITru9I 

X Bspl43I >< Tru9I >< CelllMsei >< 

><: B 9 J " X Msel X Alul . >< Bpull02I 

GAGATCTTTG AAAATAAGAC AACACTTCCT GTTAATGTTG CATTTGAGCT TTGGGCTAAG CGTAACATTA 
19680 19690 19700 19710 19720 19730 19740 

>< Fnu4HI 

X Tru9I >< EcoRV 

X BsrI >< Msel >< Bbvl >< Eco32I 

AACCAGTGCC AGAGATTAAG ATACTCAATA ATTTGGGTGT TGATATCGCT GCTAATACTG TAATCTGGGA 
19750 19760 19770 19780 19790 19800 19810 

>< Nspl 
X NspHI 
>< Nlalll 
>< Bsgl 

>< Afim 

CTACAAAAGA GAAGCCCCAG CACATGTATC TACAATAGGT GTCTGCACAA TGACTGACAT TGCCAAGAAA 
19820 19830 19840 19850 19860 19870 19880 

>< DdelX MboII >K AccI 

CCTACTGAGA GTGCTTGTTC TTCACTTACT GTCTTGTTTG ATGGTAGAGT GGAAGGACAG GTAGACCTTT 
19890 19900 19910 19920 19930 19940 19950 

SirtI x 
Sau96I x 

Nsprv x 

NspHII >< 
NlalV x 
Eco47I X 
Cfcl3I X 
x BslI 
BsiZI x 
X BsiYI 
BscBI X 
Bmel8I X 

>< Tru9I Avail X 

x Msel Asul x 

TTAGAAACGC CCGTAATGGT GTTTTAATAA CAGAAGGTTC AGTCAAAGGT CTAACACCTT CAAAGGGACC 
19960 19970 19980 19990 20000 20010 20020 

X Vspl 
X Tru9I 
x Plel 

>< Rmal >< Msel i ru 9i >< 

x Nhel x Maelll >< Tru9I 

>< Mael x Asnl x Tfil M sel x 

x HgalX Alul x Hinflx Asel x Hinfl x Msel 
AGCACAAGCT AGCGTCAATG GAGTCACATT AATTGGAGAA TCAGTAAAAA CACAGTTTAA CTACTTTAAG 

20030 20040 20050 20060 20070 20080 20090 
>< Accl >< Alw26I >< BfrlMsel X 

AAAGTAGACG GCATTATTCA ACAGTTGCCT GAAACCTACT TTACTCAGAG CAGAGACTTA GAGGATTTTA 
20100 20110 20120 20130 20140 20150 20160 

>< TthHB8I 
>< TaqI 

X SatI 

x Sdul 

x Sacl 

> < PaeR7I 

> < NspIII 

X NspII 
X HgiAI 

> < Eco88I 
X Xcral > < XhoIX Eco24I 

X Sau3AI X EC1136II 

X Ndell > < SlalX Bspl28SI 

> < Ccrix Bmyl 

> < BcoIX Banll 



X Dpnll 

X Dpnl 
»< BspAI 

X Bspl43I 



Xhol X 
TthHBBI > 
TaqI > 
Slal x 
PaeR7I X 
NspIII x 
x Mnll 
Eco88I X 
Ccrl X 
BspWI X 

Bcol X 
> < Bcgl/a 
Aval X 
Aroa87I X 
X EcoRI . X FoklAluI X 



AGCCCAGATC ACAAATGGAA ACTGACTTTC TCGAGCTCGC TATGGATGAA TTCATACAGC GATATAAGCT 



20170 20180 20190 20200 20210 20220 20230 

X TthHB8I 
X TaqI 
>< Sful 
X NspV 
X Lspl 
>< Csp45I 
X BstBI 
X BspU9I 

>< BsiCl X MboII 

X Bpul4I X Bbsl Tru9I X 

x AsuII x Bcgl x Nlalll x AcilMsel x 

CGAGGGCTAT GCCTTCGAAC ACATCGTTTA TGGAGATTTC AGTCATGGAC AACTTGGCGG TCTTCATTTA 
20240 20250 20260 20270 20280 20290 20300 

x HphI 

x HinPlI 

x Hin6I 
X Espl > < Hhal X Tfil 

x Ddel >< HaeH 

X Celll X Eco47III X Tru9I 

x Bpull02I > < Cfol X Hinfl X Msel 
X Bfrl X Bspl4 3II X Mnll 

ATGATAGGCT TAGCCAAGCG CTCACAAGAT TCACCACTTA AATTAGAGGA TTTTATCCCT ATGGACAGCA 



20310 20320 20330 20340 20350 20360 20370 

>< MstI 
x HinPlI 
X Hin6I 
X Hhal 
X Fspl 
x Fdill 
X Cfol 
X Avill 



Dpnll X 

Dpnl x 
BspAI X 
Bspl43I : 



FIGURE 13.47 
>< TthUlI 
>< TaqI 

>< Aspl >. < Maelll Maelll X 

TTTACTTGAT GACTTTGTCG AGATAATAAA GTCACAAGAT TTGTCAGTGA TTTCAAAAGT GGTCAAGGTT 
20450 20460 20470 20480 20490 20500 20510 

>< Nspl 
>< NspHI 

>< main 

X Fokl 

>< Muni > < Nlalll X Afllll 

ACAATTGACT ATGCTGAAAT TTCATTCATG CTTTGGTGTA AGGATGGACA TGTTGAAACC TTCTACCCAA 
20520 20530 20540 20550 20560 20570 20580 

X SfaNI 

X ScrFI 

X Mval 
>< EcoRII 

X EC1136I 
X DsaV 

X BstOI X SfaNI 

X BstNI X Rsal BspWI X 

X BsiLI > < Csp6l BsmI > 

x BspWI x Apyl x Afal BscCI X 

AACTACAAGC AAGTCAAGCG TGGCAACCAG GTGTTGCGAT GCCTAACTTG TACAAGATGC AAAGAATGCT 
20590 20600 20610 20620 20630 20640 20650 

x Eco571 X Maelll x HphI 

TCTTGAAAAG TGTGACCTTC AGAATTATGG TGAAAATGCT GTTATACCAA AAGGAATAAT GATGAATGTC 
20660 20670 20680 20690 20700 20710 20720 

> < Rsal 
X Csp6I 

>< BstU07I x Tru9I X Alul 

>< AccI x Msel > < AfalNlalll x 

GCAAAGTATA CTCAACTGTG TCAATACTTA AATACACTTA CTTTAGCTGT ACCCTACAAC ATGAGAGTTA 
20730 20740 20750 20760 20770 20780 20790 

X ScrFI 

x Rsal 
x Mval 
X EcoRII X HspBII 

X EC1136I X Sdul 

> < Csp6I >< NspII 

>< BstOI X PvuIIX HgiAI 
X BstNI X Ddel 

X BsiLI X Psp5IX Bspl286I 
>< Apyl >< Alul X Bmyl 
X DsaVx Afal X Alw21I 

TTCACTTTGG TGCTGGCTCT GATAAAGGAG TTGCACCAGG TACAGCTGTG CTCAGACAAT GGTTGCCAAC 
20800 20810 20820 20830 20840 20850 20860 

x Xholl 

X Tru9I 
X Sau3AI 
x Ndell 
x TthHBBI x Msel 
X Hfll 
X Mbol 
x MamI 
>< DpnII 
x Tfil x Dpnl 



FIGURE 13.48 
X BstYI > < Tfil 

>< BspAI > < Hinfl 

>< HinfIX Bspl43I X Esp3I >< Tru9l 

>< BsiBI >< Tthllll X BsmBI >< Msel 

X BsaBI >< BsmAI > < BsmAI 

C Hgal> < Alw26I 

rrACTTT aattggag/ 

20870 20880 20890 20900 20910 20920 20930 

X Styl 

X SinI 
x Sau96I 

> < SinI x Bmal 

> < SauS6I >< NspIV 
x PssI NspHII X 

x PspSII x Hael 

> < PpuMI x EcoT141 

> < NspIV X Eco47I 
X NspHII X ECOI30I 
x Nlaiv x Cfrl3l 

> < EcoO109I X BssTlI 

> < Eco47I X BsiZI 

> < Drall >< BsaJI 

> < CfrI3I X Bmel8I 

> < BsiZI X Blnl 
x BscBI X Avrll 

X Rsal > < Bmel8I X Avail 

> < Csp6I > < Avail x Asul 

x Afal > < Asul AflHI x 

TGTGCAACAG TACATACGGC TAATAAATGG GACCTTATTA TTAGCGATAT GTATGACCCT AGGACCAAAC 
20940 20950 20960 20970 20980 20990 21000 

>< Nspl 
>< NspHI 

x Nlalll x Plel Rmal x 

x Maelll x Hinfl Mael x 

ATGTGACAAA AGAGAATGAC TCTAAAGAAG GGTTTTTCAC TTATCTGTGT GGATTTATAA AGCAAAAACT 
21010 21020 21030 21040 21050 21060 21070 

X ScrFI 
>< Mval 
x EcoRII 

x EC1136I 
x DsaV 

x BatOI Sau96I > 

X BstNI NspIV > 

X BsiLI Cfrl3I > 

x BsaJI BsiZI > 

x BsaJI x Sfcl x Bsral x Bstal Asul > 

X Apyl > < Alul X BscCI >< BscCIHindlll XX Alul 

AGCCCTGGGT GGTTCTATAG CTGTAAAGAT AACAGAGCAT TCTTGGAATG CTGACCTTTA CAAGCTTATG 
21080 21090 21100 21110 21120 21130 21140 

X Zsp2I 
x PpulOI 

x Pall x Nsil 

x Haelll X Mphll03I Tru9I x 

X BsuRI X Maelll X EcoT22I X Msel 

x BshI >< Nlalllx Alul x Bcgl x Avalll >< SfaNIBcgl/a x 

GGCCATTTCT CATGGTGGAC AGCTTTTGTT ACAAATGTAA ATGCATCATC ATCGGAAGCA TTTTTAATTG 
21150 21160 21170 21180 21190 21200 21210 



FIGURE 13.49 
>< Zsp2I 
X SphI 
>< PpulOI 

>< Pael 
>< Nspl 
>< NspRI 
X Nsil 
x Nlalll 

> < Nlalll 

x Mphll03I 
x EcoT22I 

> <. Avalll x Mnll 
GGGCTAACTA TCTTGGCAAG CCGAAGGAAC AAATTGATGG CTATACCATG CATGCTAACT ACATTTTCTG 

21220 21230 21240 21250 21260 21270 21280 

Tru9I x 

x MboII >< Tru9I 

x Gsul Msei x 

>< BsrI >< Ms ei 

x Bpml Mnll x 

>< Bbsl >< Nlalll >< Mnll 

GAGGAACACA AATCCTATCC AGTTGTCTTC CTATTCACTC TTTGACATGA GCAAATTTCC TCTTAAATTA 
21290 21300 21310 21320 21330 21340 21350 

X Tru9I 
X Msei 
X Esp4I> < Tfil 
X BsmAI Ksp632I X 

x Alw26I x MboII >< Earl 

X AflII> < Hinfl Eamll04I X 

AGAGGAACTG CTGTAATGTC TCTTAAGGAG AATCAAATCA ATGATATGAT TTATTCTCTT CTGGAAAAAG 
21360 21370 21380 21390 21400 21410 21420 

>< Tru9I 
X Msei 
x Hindll 
x Hindi 
X Hpal Afllll > 

GTAGGCTTAT CATTAGAGAA AACAACAGAG TTGTGGTTTC AAGTGATATT CTTGTTAACA ACTAAACGAA 
21430 21440 21450 21460 21470 21480 21490 

x Vnel 
X Snol 

X Sdul 
X NspII 
>< Hpall 

X HgiAI 
x HapII 
X CfrlOl 

X Bspl286I 
x Msplx Brayl 

X Nspl x Spel x ApaLI 

X NspHI >< Rmal >< Alw44I 

X Nlalll x Mael >< Maelll X Agel x Alw21I 

CATGTTTATT TTCTTATTAT TTCTTACTCT CACTAGTGGT AGTGACCTTG ACCGGTGCAC CACTTTTGAT 
21500 21510 21520 21530 21540 21550 21560 

> < Alul x Mnll 

GATGTTCAAG CTCCTAATTA CACTCAACAT ACTTCATCTA TGAGGGGGGT TTACTATCCT GATGAAATTT 
21570 21580 21590 21600 21610 21620 21630 



FIGURE 13.50 
>< Ndell 
>< Mbol 
>< DpnII 

X Dpnl >< Tru9I 

>< BspAI >< Msel > < MboII 

>< BspH3l >< Ddel >< Haelll 

TTAGATCAGA CACTCTTTAT TTAACTCAGG ATTTATTTCT TCCATTTTAT TCTAATGTTA CAGGGTTTCA 
21640 21650 21660 21670 21680 21690 21700 

>< Vspl 
>< Tcu9I 
>< Msel • 

>< Asnl >< Tru9I x Fokl 

>< Asel x Maell x Msel x Bbvl > < Fnu4HI . 

TACTATTAAT CATACGTTTG GCAACCCTGT CATACCTTTT AAGGATGGTA TTTATTTTGC TGCCACAGAG 
21710 21720 21730 21740 21750 21760 21770 

x BslI 

x Osalx BsiYi x Nlalll 

x BsaJl > < Maelll 

AAATCAAATG TTGTCCGTGG TTGGGTTTTT GGTTCTACCA TGAACAACAA GTCACAGTCG GTGATTATTA 
21780 21790 21800 21810 21820 21830 21840 

X Nspl 

X Tru9I >< NspHI 

X Msel >< Nlalll 

>< HphI >< Maelll x Maelll 

TTAACAATTC TACTAATGTT GTTATACGAG CATGTAACTT TGAATTGTGT GACAACCCTT TCTTTGCTGT 
21850 21860 21870 21880 21890 21900 21910 

>< Styl >< Zsp 2i 

X Nlalll >< Tru9I 

X Ncol X Rsal x PpulOI TthHS8I X 

X ECOT14I >< Hsil X TaqI 

X Eool30I >< Msel SfaNl x 

>< DsalX Csp6I >< Mphll03r Rsal >< 

X BssTlI x TthHBBI >< EcoT22I Csp6I X 

>< BsaJIx Afal >< TaqI X Avalll Afal X 

TTCTAAACCC ATGGGTACAC AGACACATAC TATGATATTC GATAATGCAT TTAATTGCAC TTTCGAGTAC 

21920 21930 21940 21950 21960 21970 21980 

X Msel 
>< Oral 

ATATCTGATG CCTTTTCGCT TGATGTTTCA GAAAAGTCAG GTAATTTTAA ACACTTACGA GAGTTTGTGT 
21990 22000 22010 22020 22030 22040 22050 

X Sau3AI 
X Ndell 
x Mbol 
X Dpnir 
x Dpnl 

x Msel >< B S p A i 

. >< D"I >< sfcl Bspl43I x 

TTAAAAATAA AGATGGGTTT CTCTATGTTT ATAAGGGCTA TCAACCTATA GATGTAGTTC GTGATCTACC 
22060 22070 22080 22090 22100 22110 22120 

X Tru9I 

>< Tru9I > < Tru9I x Maal 

>< Msel > < Msel x Mnll 

TTCTGGTTTT AACACTTTGA AACCTATTTT TAAGTTGCCT CTTGGTATTA ACATTACAAA TTTTAGAGCC 
22130 22140 22150 22160 22170 22180 22190 

FIGURE 13.51 



X TruSI 
> < SduIX Sfcl 

>< PvuII 
>< Psp5I 

> < NspII 

>< NspBII 

> < Maell > < Pnu4HI 

> < Bspl286I >< PstI Tru9I > 
>< BspMI > < BraylX FiuMHI Msel > 

>< HphI X Bbvl X Alul x Bbvl 

ATTCTTACAG CCTTTTCACC TGCTCAAGAC ATTTGGGGCA CGTCAGCTGC AGCCTATTTT GTTGGCTATT 
22200 22210 22220 22230 22240 22250 22260 

X SfaNI 
X Rsal 

> < Csp€I 

x Dral >< Afal >< AlwNl 

TAAAGCCAAC TACATTTATG CTCAAGTATG ATGAAAATGG TACAATCACA GATGCTGTTG ATTGTTCTCA 
22270 22280 22290 22300 22310 22320 22330 

> < Tru9I 

> < Msel 

x Alul 

AAATCCACTT GCTGAACTCA AATGCTCTGT TAAGAGCTTT GAGATTGACA AAGGAATTTA CCAGACCTCT 
22340 22350 22360 22370 22380 22390 22400 

x Saul 
X Mstll 
x EcoSlI 
X Ddel 
>< Cvnl 
X Bsu36I 
X Bse21I 

X Axyl X Tfil 

x Mnll X AocI x Mnll x Hinfl >< Sspl x Mnll 

AATTTCAGGG TTGTTCCCTC AGGAGATGTT GTGAGATTCC CTAATATTAC AAACTTGTGT CCTTTTGGAG 
22410 22420 22430 22440 22450 22460 22470 

X Zsp2I 
x PpulOI 
x Nsil 

> < Nlalll 
X Mphll03I 

x Tru9I X ECOT22I 

>< Msel X Aval II 

AGGTTTTTAA TGCTACTAAA TTCCCTTCTG TCTATGCATG GGAGAGAAAA AAAATTTCTA ATTGTGTTGC 
22480 22490 22500 22510 22520 22530 22540 

X Sdul 
X NspII 
X HgiAI 
X Bspl286I 

X Bmyl X Tru9I 

x Alw2ii x Msel Ddel x 

TGATTACTCT GTGCTCTACA ACTCAACATT TTTTTCAACC TTTAAGTGCT ATGGCGTTTC TGCCACTAAG 
22550 22560 22570 22580 22590 22600 22610 

>< Sau3AI 
>< Ndell 
>< Mbol 
>< DpnII 
x Dpnl 



FIGURE 13.52 
>< BspAI x Tfil 

>< Bspl43I x Hinfl 

TTGAATGATC TTTGCTTCTC CAATGTCTAT GCAGATTCTT TTGTAGTCAA GGGAGATGAT GTAAGACAAA 
22620 22630 22640 22650 22660 22670 22680 

x ScrFI 
>< Mval 
>< HinPlI 
X Hin6I 
.X Hhal 

X Haell 
X EcoRII 

X Ecll36I 
>< DsaV 
x cfor 

>< BstOI 
>< BstNI 
X Bspl43II 
X BsiLI 

x Apyl > < Bar I >< m aIII 

TAGCGCCAGG ACAAACTGGT GTTATTGCTG ATTATAATTA TAAATTGCCA GATGATTTCA TGGGTTGTGT 
22690 22700 22710 22720 22730 22740 22750 

x sfam 

X Rmal Odel x. 

>< Mael >< BsrI Bfrl X 

CCTTGCTTGG AATACTAGGA ACATTGATGC TACTTCAACT GGTAATTATA ATTATAAATA TAGGTATCTT 

22760 22770 22780 22790 22800 22810 22820 

X Sail 9 61 

X Pali 
x NspIV 
> < Hindi II 

>< Haelll 

>< Eco0109I 
x Drall 
x Ddel 

x Cfrt3I 
X BsuRI 
>< BsiZI 
X Bshl 
x Bfrl X PssI 
X Nlalll >< AauIX BsmAI 

x Alul X Alw26I BspWI >< 

AGACATGGCA AGCTTAGGCC CTTTGAGAGA GACATATCTA ATGTGCCTTT CTCCCCTGAT GGCAAACCTT 
22830 22840 22850 22860 22870 22880 22890 

X Tru9I 
x Pall 
x MscI 
x Haelll 
X EaelX Use I 
X Tru9l x BsuRI 

X Msel x Bshl 

x BspMI >< Ball BsrI x 

GCACCCCACC TGCTCTTAAT TGTTATTGGC CATTAAATGA TTATGGTTTT TACACCACTA CTGGCATTGG 
22900 22910 22920 22930 22940 22950 22960 

Sau96I X 
X PallNspIV X 
> < Mspl NspHII X 
x Haelll 
> < Hpall Eco47I X 

X Dsal 

> < Hapll Cfrl3I X 

>< BsuRISinI >< 
>< Gdill BsiZI X 
>< Seal x BsaJI 

x Rsal x Tru9l x Eael Bmel8I x 

x Csp6I x Msel X CfrlOI Avail x 

X Afal x Oral x Bshl Asul X 

CTACCAACCT TACAGAGTTG TAGTACTTTC TTTTGAACTT TTAAATGCAC CGGCCACGGT TTGTGGACCA 
22970 22980 22990 23000 23010 23020 23030 

X Tru9I X Rsal 

X Tru9I x Csp6I 

>< Plel BsrI x 

> < Tru9I x Msel X BsrI 

> < Mseix BsrI x Msel x Hinfl >< Afal 
AAATTATCCA CTGACCTTAT TAAGAACCAG TGTGTCAATT TTAATTTTAA TGGACTCACT GGTACTGGTG 

23040 23050 23060 23070 23080 23090 23100 

X Tru9I x Pall 

X Msel X Haelll 

>< MboII x Gdill 

X Hpal >< Gael 

X Hindll X BsuRI Tfil X 

>< Hindi x Bshl Hinfl x 
TGTTAACTCC TTCTTCAAAG AGATTTCAAC CATTTCAACA ATTTGGCCGT GATGTTTCTG ATTTCACTGA 

23110 23120 23130 23140 23150 23160 23170 

> < XhoII 
X TthHB8I 
X Taql 

> < Sau3AI 

> < NdeH 

> < Mfll 

> < Mbol 

> < Dpnll 

X Dpnl 

> < BstYI 

> < Sspl 
>< HphI 

TTCCGTTCGA GATCCTAAAA CATCTGAAAT ATTAGACATT TCACCTTGCT CTTTTGGGGG TGTAAGTGTA 
23180 23190 23200 23210 23220 23230 23240 

X ScrFI 
x Mval 
X EcoRII 

x Ecll36I x Tru9I 

x DsaV x Msel 

X SstOI X Hpal 

>< BstNI X Hindll 

X BsiLI X Eco57I 

x Apyl >c Bsgl x Hindi 

ATTACACCTG GAACAAATGC TTCATCTGAA GTTGCTGTTC TATATCAAGA TGTTAACTGC ACTGATGTTT 
23250 23260 23270 23280 23290 23300 23310 

x Sau3AI 
X Nlalll 
X Ndell 
X Mbol 
X Dpnll 

X Dpnl x HinPil 

FIGURE 13.54 
>< BspHl >< Hin6I 

>< BspAI > < Hhal Plel >< 

>< Sfcl >< Bspl43I >< Alul> < Cfol >< BsrI 

CTACAGCAAT TCATGCAGAT CAACTCACAC CAGCTTGGCG CATATATTCT ACTGGAAACA ATGTATTCCA 
23320 23330 23340 23350 23360 23370 23380 

>< TthHB8I 
>< TaqI 
>< Sail 
>< Rtrl 
>< Nspl 
>< Espl >< NspHI 
>< Ddel >< Nlalll 
>< Cell! >< Hindll 
X Bpull02I>< Hindi 
>< Hinfl >< AlttI X ACCl 

GACTCAAGCA GGCTGTCTTA TAGGAGCTGA GCATGTCGAC ACTTCTTATG AGTGCGACAT TCCTATTGGA 
23390 23400 23410 23420 23430 23440 23450 

> < SnaBI 

x Seal 
x Rsal 
X Rmal 
x Maeil x Mael 

> < Ecol05I 
x Rmal X Csp6I 

X Maelll > < BsaAI 

X Alul x Mael x Afal 

GCTGGCATTT GTGCTAGTTA CCATACAGTT TCTTTATTAC GTAGTACTAG CCAAAAATCT ATTGTGGCTT 
23460 23470 23480 23490 23500 23510 23520 

>< Muni 

ATACTATGTC TTTAGGTGCT GATAGTTCAA TTGCTTACTC TAATAACACC ATTGCTATAC CTACTAACTT 
23530 23540 23550 23560 23570 23580 23590 

Rsal X 
x Mnll 

Csp6I X 

>< Sfcl Afal x 

TTCAATTAGC ATTACTACAG AAGTAATGCC TGTTTCTATG GCTAAAACCT CCGTAGATTG TAATATGTAC 
23600 23610 23620 23630 23640 23650 23660 

> < Tfil 

> < Hinfl 

X Acil > < Alul 

ATCTGCGGAG ATTCTACTGA ATGTGCTAAT TTGCTTCTCC AATATGGTAG CTTTTGCACA CAACTAAATC 
23670 23680 23690 23700 23710 23720 23730 

>< Vnel 

X Sdul 
X Nspl I 

x HgiAI x Pair 

x Snoix Ddel x Sau3AI X PmaCI 

X Bspl286I x Ndell x Maeil 

x Bmyl x Mbol X Eco72I 

x Bbvl x Dpnl x BsaAI 

X ApaLI X Bspl43I x BbrPI 

X Alw44I >< Opnll X Alwl 

X Alw211 X Fnu4HI X BspAI X AflHI 
GTGCACTCTC AGGTATTGCT GCTGAACAGG ATCGCAACAC ACGTGAAGTG TTCGCTCAAG TCAAACAAAT 

23740 23750 23760 23770 23780 23790 23800 



FIGURE 13.55 
>< Rsal 

X Csp6I >< Tcvt9 x 

>< Afal >< sspl >< Msel x sspl 

GTACAAAACC CCAACTTTGA AATATTTTGG TGGTTTTAAT TTTTCACAAA TATTACCTGA CCCTCTAAAG 
23810 23820 23830 23840 23850 23860 23870 

>< Mnir 

>< T >< Tru9I >< SfaMI >< H phl Nlalll X 

x Ddel x mil >< HseI >< Maelll BspHI x 

CCAACTAAGA GGTCTTTTAT TGAGGACTTG CTCTTTAATA AGGTGACACT CGCTGATGCT GGCTTCATGA 

23880 23890 23900 23910 23920 23930 23940 





x XhoII 






X Sau3AI 




X Styl 


X Rmal 




x Rmal 


x Hdell 




X Mael 


X Mfll 




>< EcoT14I 


X Mbol 


X MstI 


X Ecol301 


x Mael 


X HinPlI 


x BssTlI 


>< Vspl x DpnII 


x Hin6I 


>< BsmI 


x Hphl> < Dpnl 


x Hhal 


BscCI 


>< Tru9l x BstYI 


X Fspl 


X BsaJI : 


>< Msel x BspAI 


x Fdill 


X Blnl : 


>< Asnl > < Bspl43I 


x CEoI 


>< Avrll ; 


>< Asel X Bglll 


X AvlII 



23950 23960 23970 23980 23990 24000 24010 

X RmalRsal x 
X Mnll >< Fnu4HI X Fnu4HI Csp6I >< 

x BspWI >< Bbvi >< BbvX x BspWI X MaelAfal x 

TACAGTGTTG CCACCTCTGC TCACTGATGA TATGATTGCT GCCTACACTG CTGCTCTAGT TAGTGGTACT 
24020 24030 24040 24050 24060 24070 24080 

X MboII 
X HinPlI 
x Hin6I 
X Hhal 
x Haell 

X Fnu4HI x Ksp632I 
x Cfol >< Earl 
>< Fokl X BspWI X Eamll04I 
X Bbvr >< Bspl43II 

GCCACTGCTG GATGGACATT TGGTGCTGGC GCTGCTCTTC AAATACCTTT TGCTATGCAA ATGGCATATA 
24090 24100 24110 24120 24130 24140 24150 

Tru9I >< 

x Maelll MseI >c 

GGTTCAATGG CATTGGAGTT ACCCAAAATG TTCTCTATGA GAACCAAAAA CAAATCGCCA ACCAATTTAA 
24160 24170 24180 24190 24200 24210 24220 

Maell x 

>K TfiI X Fnu4HI 

x Hinfl >< Bbvl x Alul 

CAAGGCGATT AGTCAAATTC AAGAATCACT TACAACAACA TCAACTGCAT TGGGCAAGCT GCAAGACGTT 
24230 24240 24250 24260 24270 24280 24290 

X Tru9I 
x Msel 

>< "Pal X Ddel 

X Hlndll x BsmI x Tru9I x Tru9I X Bfrl 

X Hincllx BscCI x Msel x Msel x Alul 

FIGURE 13.56 
GTTAACCAGA ATGCTCAAGC ATTAAACACA CTTGTTAAAC AACTTAGCTC TAATTTTGGT GCAATTTCAA 
24300 24310 24320 24330 24340 24350 24360 



>< Nrul 
>< Mvni 

>< BstUI >< TthKB8I 

>< Bsp68I >< TaqI >< Rsal 

X EcoRV >< Bsp50I >< Mnll >< Csp6I X Tru9I 

>< Eco32I >< AccII >< Mnll >< AcilX Afal X Msel 

GTGTGCTAAA TGATATCCTT TCGCGACTTG ATAAAGTCGA GGCGGAGGTA CAAATTGACA GGTTAATTAC 
24370 24380 24390 24400 24410 24420 24430 

X Maelll >< Bbvl X Fnu4KI Bbvl X 

AGGCAGACTT CAAAGCCTTC AAACCTATGT AACACAACAA CTAATCAGGG CTGCTGAAAT CAGGGCTTCT 
24440 24450 24460 24470 24480 24490 24500 

X Fnu4HI >< Hindll 

X BspWI X Ddel x Hindi 

GCTAATCTTG CTGCTACTAA AATGTCTGAG TGTGTTCTTG GACAATCAAA AAGAGTTGAC TTTTGTGGAA 



24510 24520 24530 24540 24550 24560 24570 



> < Nspl 

> < NspHl 

> < main 

x Maelll 
X Nlalll x Maell 

X Mboll >< Fokl 

X Fnu4HI X Bbsl BsaAI X 

X AcilX Bbvl X Afllll 

AGGGCTACCA CCTTATGTCC TTCCCACAAG CAGCCCCGCA TGGTGTTGTC TTCCTACATG TCACGTATGT 
24580 24590 24600 24610 24620 24630 24640 

X ScrFI 
x Mval 
X EcoRII 
X Eel 13 61 
X BstOI 
X BstNI 
. X Mnll X BslI 
X DsaVX BsiYI 

X BsiLI 
>< BsaJIX HphI 
X Apyl 



>< HinPlI 
>< Hin6I 
x Hhal 

x Haell 
x Cfol x Nlalll 

X Bspl4 3II X BspHI EcoNI X 

GCCATCCCAG GAGAGGAACT TCACCACAGC GCCAGCAATT TGTCATGAAG GCAAAGCATA CTTCCCTCGT 
24650 24660 24670 24680 24690 24700 24710 

x Mnll 
X BslI X Tru9I 

x BsiYI x Msel x Mnll 

GAAGGTGTTT TTGTGTTTAA TGGCACTTCT TGGTTTATTA CACAGAGGAA CTTCTTTTCT CCACAAATAA 
24720 24730 24740 24750 24760 24770 24780 

X Ddel x Tru9I 

X BsmAI >< SfaNI 

X Sfcl X Alw26I X MselAlwI X 

TTACTACAGA CAATACATTT GTCTCAGGAA ATTGTGATGT CGTTATTGGC ATCATTAACA ACACAGTTTA 
24790 24800 24810 24820 24830 24840 24850 



FIGURE 13.57 
>< Mbol >< Plel > < Seal 

>< DpnII >< Mnll > < Ksp632I > < Rsal 

>< Dpnl x Ddel x Hinfl >< MboII 

>< BspAI >< BspWI > < Eamll04I >< Csp6I 

>< Bspl43I >< Alul > < Earl > < Alul > < Afal > < HphI 
TGATCCTCTG CAACCTGAGC TTGACTCATT CAAAGAAGAG CTGGACAAGT ACTTCAAAAA TCATACATCA 

24860 24870 24880 24890 24900 24910 24920 

>< Sau3AI 
x Ndell 
X Mbol 
X Mam! 
>< DpnII 

x Dpnl 
X BspAI 

X Bapl43l 

X BsiBI x Tru9I x Hindll 

x BsaBI x Msel x Hindi Acil X 

CCAGATGTTG ATCTTGGCGA CATTTCAGGC ATTAACGCTT CTGTCGTCAA CATTCAAAAA GAAATTGACC 
24930 24940 24950 24960 24970 24980 24990 

x Tru9I 

> < Tfil 
X Mnll X Swal 

x EcoNI x Msel 

>< BslI . > < Hinfl 

X Mnllx BsiVI x Dral 

GCCTCAATGA GGTCGCTAAA AATTTAAATG AATCACTCAT TCACCTTCAA GAATTGGGAA AATATGAGCA 
25000 25010 25020 25030 25040 25050 25060 

>< Styl 
X Pall 
x Haelll 

X EcoT14I 

X Ecol30I 
X BsuRI 

X BssTlI Nlalll X 

x Tru9I>< BshI ' Maelll x 

>< Msel x BsaJI >< BstXI 

ATATATTAAA TGGCCTTGGT ATGTTTGGCT CGGCTTCATT GCTGGACTAA TTGCCATCGT CATGGTTACA 
25070 25080 25090 25100 25110 25120 25130 

> < SphI 

> < Pael 

x Spel > < Nspl 

> < Rmal > < NspHI 
X Nlalll > < Nlalll 

> < Mael x Mnllx Bbvl Fnu4HI x 
ATCTTGCTTT GTTGCATGAC TAGTTGTTGC AGTTGCCTCA AGGGTGCATG CTCTTGTGGT TCTTGCTGCA 

25140 25150 25160 25170 25180 25190 25200 

X Fokl 
x Ddel 

x Mnll x Pleix Hinfl x BsrI 

AGTTTGATGA GGATGACTCT GAGCCAGTTC TCAAGGGTGT CAAATTACAT TACACATAAA CGAACTTATG 
25210 25220 25230 25240 25250 25260 25270 

>< Sau3AI 
>< Ndell 
X Mbol 
x DpnII 
> < Dpnl 

FIGURE 13.58 
>< BspAI 

> < Bspl43I 

>< Bsgl >< Alwl >< BscI BspWI > 

GATTTGTTTA TGAGATTTTT TACTCTTGGA TCAATTACTG CACAGCCAGT AAAAATTGAC AATGCTTCTC 
25280 25290 25300 25310 25320 25330 25340 

>< Seal 
>< Rsal 
>< Csp6I >< Sfcl 
>< Afal >< Nlalll >< Acil X.Mnll Fokl > 

CTGCAAGTAC TGTTCATGCT ACAGCAACGA TACCGCTACA AGCCTCACTC CCTTTCGGAT GGCTTGTTAT 
25350 25360 25370 25380 25390 25400 25410 

> < HinPlI 

> < Hln6I 

>< Hhal Rmal >< 

>< Haell >< HinPlI Nhel >< 

x Eco47III >< Hin6I Mael >< 

>< Cfol >< Hhal Fnu4HI X 

X BspWI X Bspl43II X Cfol Alul X 
TGGCGTTGCA TTTCTTGCTG TTTTTCAGAG CGCTACCAAA ATAATTGCGC TCAATAAAAG ATGGCAGCTA 

25420 25430 25440 25450 25460 25470 25480 

X BcoNI 
x BslX 

X BslYI x Maelll 

X Bbvl X Bsrl X Bbvl > < Fnu4HI Bbvl X 

GCCCTTTATA AGGGCTTCCA GTTCATTTGC AATTTACTGC TGCTATTTGT TACCATCTAT TCACATCTTT 
25490 25500 25510 25520 25530 25540 25550 

Zsp2I .X 
PpulOI X 

> < Sfcl x HinPlI Nsil x 

x PstI x Hin6I x Rsal Mphll03I x 

> < Fnu4HI x Hhal x Csp6I EcoT22I X 
x BspMI x Mnll x Cfol x Afal x Mnll Avalll x 



x SfaNI 
x Nspl 
X NspHI 

>< Nlalll x SfaNI 

CAACGCATGT AGAATTATTA TGAGATGTTG GCTTTGTTGG AAGTGCAAAT CCAAGAACCC ATTACTTTAT 
25630 25640 25650 25660 25670 25680 25690 

x Bstll07I 
X Accl Maelll X 
GATGCCAACT ACTTTGTTTG CTGGCACACA CATAACTATG ACTACTGTAT ACCATATAAC AGTGTCACAG 
25700 25710 25720 25730 25740 25750 25760 

X ttooll 

>< HphI BstXI X 

>< Muni x Maelll x Maelll x Eco57I x Bbsl Mnll > 

ATACAATTGT CGTTACTGAA GGTGACGGCA TTTCAACACC AAAACTCAAA GAAGACTACC AAATTGGTGG 
25770 25780 25790 25800 25810 25820 25830 

>< Rsal 

> < Hlalll 
X HphI 
x Tru9I x Tthllllx Csp6I 
x Ddel x Ddel x Mselx Aspl >< Afal 



FIGURE 13.59 
TTATTCTGAG GATAGGCACT CAGGTGTTAA AGACTATGTC GTTGTACATG GCTATTTCAC CGAAGTTTAC 
25840 25850 25860 25870 25880 25890 25900 

Tru9I >< 

> < Hinflx Plel >< BsrI Msel >< 

X Alul >< AccI X Sfcl X AlwNI >< MboII Hindlll > 

TACCAGCTTG AGTCTACACA AATTACTACA GACACTGGTA TTGAAAATGC TACATTCTTC ATCTTTAACA 
25910 25920 25930 25940 25950 25960 25970 

> < TthHB8I 

X Tru9I > < Taqi X Ksp632I 

>< Msel > < MboII >< Earl BspWI >< 

>< Alul >< Eco57I >< Eamll04I Alwl >< 

AGCTTGTTAA AGACCCACCG AATGTGCAAA TACACACAAT CGACGGCTCT TCAGGAGTTG CTAATCCAGC 
25980 25990 26000 26010 26020 26030 26040 

X XhoII 
>< Sau3AI 

>< NlalV 
>< Udell 
x Mfll 
X Mbol 
x DpnII 

>< Dpnl 
x BstYI 
>< BstI 
x BspAI 

x Bspl43I RsaI >< 

X BscBI >< Rmal Csp6I X 

X BamHI x Alwl x Mael Afal x 

AATGGATCCA ATTTATGATG AGCCGACGAC GACTACTAGC GTGCCTTTGT AAGCACAAGA AAGTGAGTAC 
26050 26060 26070 26080 26090 26100 26110 

> < Tru9I 
x RsaI 

> < Msel 
x MboII 

> < RsaI x Maell >< RsaI 
X Csp6I x Csp6l x Tru9I x Cs P 6I 

> < Afal x Afal >< Mael x Afal 
GAACTTATGT ACTCATTCGT TTCGGAAGAA ACAGGTACGT TAATAGTTAA TAGCGTACTT CTTTTTCTTG 

26120 26130 26140 26150 26160 26170 26180 

X TthHBSI 
X Taqi 

x Rmal x HinPlI > < RsaI 

> < Maelll >< Hin6l Fnu4HI x 

x Mael x Rmal x Hhal x Csp6l 

>< Fokl x Mael x Cfol x Bbvl > < Afal 

CTTTCGTGGT ATTCTTGCTA GTCACACTAG CCATCCTTAC TGCGCTTCGA TTGTGTGCGT ACTGCTGCAA 
26190 26200 26210 26220 26230 26240 26250 

X Tru9I 

x Tru9I >< Thai 

>< Msel x Mvnl 

>< Sspl >< Maell x Msel 

>< Hpal x BstOI Ksp632I > 

X Hindll x Maell X Bsp50I X MboII Earl > 

X Hindi x AccI X AccII Eamll04I > 

TATTGTTAAC GTGAGTTTAG TAAAACCAAC GGTTTACGTC TACTCGCGTG TTAAAAATCT GAACTCTTCT 
26260 26270 26280 26290 26300 26310 26320 



FIGURE 13.60 
>< Sau3AI 
>< Ndell 
>< Mbol 
X DpnII 
>< Mbolix Dpnl 

X Xninl x BspAI> < Eco57I X Tru9I 

X Asp700I>< Bspl43I X Msel 

GAAGGAGTTC CTGATCTTCT GGTCTAAACG AACTAACTAT TATTATTATT CTGTTTGGAA CTTTAACATT 
26330 26340 26350 26360 26370 26380 26390 

X SorFI 
>< Mval 
X EcoRII 

x Edl36I 
X DsaV NlalV x 
x Rsal x BstOI 

>< Mnll x Tru9I X BstNI Rmal X 

X Csp6I X Msel X BsiLI Mael X 

> < Nlalll X Afal > < Alul X ApylBscBI X 

GCTTATCATG GCAGACAACG GTACTATTAC CGTTGAGGAG CTTAAACAAC TCCTGGAACA ATGGAACCTA 
26400 26410 26420 26430 26440 26450 26460 

X ScrFI 
x Rmal 

x Mval 
X Mael 

x EcoRII 

X Ecll36I 
x DsaV 

X BstOI 

X BstNI 

X BsiLI . 

x Apyl X MaelXI 

GTAATAGGTT TCCTATTCCT AGCCTGGATT ATGTTACTAC AATTTGCCTA TTCTAATCGG AACAGGTTTT 
26470 26480 26490 26500 26510 26520 26530 

x Pall 
x MscI 
x Mnll x Maelli 
X Haelll 
x Eael 
X BsuRI 
>< BsrI 

>< Rsal x BspWI 

X Csp6I X Hindlll X BshI 

X Afal >< Alul x Ball x Bbvl fnu4HI X 

TGTACATAAT AAAGCTTGTT TTCCTCTGGC TCTTGTGGCC AGTAACACTT GCTTGTTTTG TGCTTGCTGC 
26540 26550 26560 26570 26580 26590 26600 

X Vspl 
x Tru9I 

x Msel x HphI 

x Sfcl x Asnl x BsrI 

>< Acol >< Aselx Maelllx Acil 
TGTCTACAGA ATTAATTGGG TGACTGGCGG GATTGCGATT GCAATGGCTT GTATTGTAGG CTTGATGTGG 
26610 26620 26630 26640 26650 26660 26670 

X Espl 

X Eco57I 
X Odel • 

X Celll x Rsal 

x Bpull02I x Csp6I 



FIGURE 13.61 
X BfrI X AfaI

>< Alul >< Acil MboII > 

. CTTAGCTACT TCGTTGCTTC CTTCAGGCTG TTTGCTCGTA CCCGCTCAAT GTGGTCATTC AACCCAGAAA 

26680 26690 26700 26710 26720 26730 26740 

>< Neil 
>< Mspl 
>< Hpall. 
>< HapII 
>< DsaVX Mnll 

>< Ball 

x BsiYI 

>< BsaJI >< Hunt > < xcml 

>< Bcnl >< Maelll >< Acil >< Mlalll 

CAAACATTCT TCTCAATGTG CCTCTCCGGG GGACAATTGT GACCAGACCG CTCATGGAAA GTGAACTTGT 
26750 26760 26770 26780 26790 26800 26810 

TruSI >< 

SinI > 
Sau961 > 
PpuMl > 
NsplV > 
Msel X 
>< Maelll 

x Sau3AI > < Rmal >< Haell 

>< Ndell x Pall > < Mael EcoO109I > 

>< Mbol >< Mspl X HinPlIEco47I > 

x .Fbal x Hpall x Stylx Hin6I Drall > 

>< Dpnll x HapII x EcoT14I Cfcl3I > 

>< Dpnl >< Haelll x Ecol30I>< Bspl43II 

X BspAI x Gdill x BssTlI BsiZI > 

x Bspl43I x Eael x BsaJI BmelSl > 

X BsiQI >< bsuRI x Blnl X Hhal Avail > 

X Bell X Maelll X BshI X Avrll X Cfol Asul > 

CATTGGTGCT GTGATCATTC GTGGTCACTT GCGAATGGCC GGACACTCCC TAGGGCGCTG TGACATTAAG 
26820 26830 26840 26850 26860 26870 26880

x Sau3AI 
x Ndell 
x Mbol 
>< Dpnll 
>< Dpnl 
x PssI x BspMI 
X Psp5II X BapAI >< XmnI 

X NspHII X Bspl43I X Asp700I > < Hgal Fnu4HI X 

GACCTGCCAA AAGAGATCAC TGTGGCTACA TCACGAACGC TTTCTTATTA CAAATTAGGA GCGTCGCAGC 
26890 26900 26910 26920 26930 26940 26950 

X Tfil 
>< Hinfl 



X BbvI

x Tru9I



x Bbvl >< Fnu4HI x Acil > < Msel 

GTGTAGGCAC TGATTCAGGT TTTGCTGCAT ACAACCGCTA CCGTATTGGA AACTATAAAT TAAATACAGA 

26960 26970 26980 26990 27000 27010 27020 

>< Mspl >< R sa i 

x Hpall x Rmal 

x HapII >< Csp6I 

>< CfrlOl x MaelX Bcgl Hindu x 

x Bcgl/a x sspl x Afai x Maelll Hindi >< 



FIGURE 13.62 
CCACGCCGGT AGCAACGACA ATATTGCTTT GCTAGTACAG TAAGTGACAA CAGATGTTTC ATCTTGTTGA
27030 27040 27050 27060 27070 27080 27090 

>< ScrFI 

X Mval 
>< Maelll 
X EcoRII 

>< Eel 1361 
>< DsaV 

X BstOI 

>< BstNI 

>< telw X Tfil 

>< Apyl X Mnll Hlnfl x. 

CTTCCAGGTT ACAATAGCAG AGATATTGAT TATCATTATG AGGACTTTCA GGATTGCTAT TTGGAATCTT 
27100 27110 27120 27130 27140 27150 27160 

>< BsmAI >< Tru9I > < Hnll 

X Maell >< Alw26I >< Msel >< Ddel >< MboII 

GACGTTATAA TAAGTTCAAT AGTGAGACAA TTATTTAAGC CTCTAACTAA GAAGAATTAT TCGGAGTTAG 
27170 27180 27190 27200 27210 27220 27230 

X Ksp632I 
X MboII >< Earl 

x MboII x NlaIIIEamll04I >< 

ATGATGAAGA ACCTATGGAG TTAGATTATC CATAAAACGA ACATGAAAAT TATTCTCTTC CTGACATTGA 
27240 27250 27260 27270 27280 27290 27300 

> < Rsal >< Rsal 
X Csp6I >< Csp6I 
> < Alul >< Mnll > < Afal x Afal 

TTGTATTTAC ATCTTGCGAG CTATATCACT ATCAGGAGTG TGTTAGAGGT ACGACTGTAC TACTAAAAGA 
27310 27320 27330 27340 27350 27360 27370 

X Mnll x Hphl x HphI x Mnll 

ACCTTGCCCA TCAGGAACAT ACGAGGGCAA TTCACCATTT CACCCTCTTG CTGACAATAA ATTTGCACTA 
27380 27390 27400 27410 27420 27430 27440 

. Sau3AI > 

> < PvuII 

> < PspSI 

> < NspBII 
x TthHBBI Ndell > 
X TaqI Mbol > 

>< Raal x Fnu4HI 
x Csp6I DpnII > 

>< R»al x Bbvl BspAI > 

><c Mae I x Afal > < Alul 

ACTTGCACTA GCACACACTT TGCTTTTGCT TGTGCTGACG GTACTCGACA TACCTATCAG CTGCGTGCAA 
27450 27460 27470 27480 27490 27500 27510

X SstI 
X Sdul 
x SacI 
X NspII 
X HgiAI 
x Eco24I 
> < EC1136II 

X BspWI 
x Bspl286I 
x Bmyl 
X Ban I I 
X Alw21I 



X Hphi 

>< Dpnl X MnH 



FIGURE 13.63
>< Bspl43I >< Mnll > < Alul Bbvl ><

GATCAGTTTC ACCAAAACTT TTCATCAGAC AAGAGGAGGT TCAACAAGAG CTCTACTCGC CACTTTTTCT 
27520 27530 27540 27550 27560 27570 27580 

SstI X 
Sdul >< 
Sad X 
NspII >< 
HgiAI X 
Eco24I >< 
Ecll36II X 
Bspl286I x 
Bmyl >< 

X Rmal >< Tru9I Banll X 

X Mael x Msel >< Tru9I Alw21I X 

x Fnu4HI x HphI X Msel Alul X 

CATTGTTGCT GCTCTAGTAT TTTTAATACT TTGCTTCACC ATTAAGAGAA AGACAGAATG AATGAGCTCA 
27590 27600 27610 27620 27630 27640 27650 

x Tru9I >< T ru9I 

X Msel >< MseI 

CTTTAATTGA CTTCTATTTG TGCTTTTTAG CCTTTCTGCT ATTCCTTGTT TTAATAATGC TTATTATATT 
27660 27670 27680 27690 27700 27710 27720 

X XhoII 
X Xbal 

> < ScrFI 

X Sau3AI 

>< Rmal 
X Ndell 

> < Mval 

x Mfll 
x Mbol 
>< EcoRIlx Mael 

> < Ecll36I 

>< DpnII 

x Dpnl 
X BstYI 

> < BstOI 

> < BstNI 

X TthHB8I X BspAl > < Rsal 

X DsaVX Bspl43I x MboII 

> < BslLI X Csp6I 

x TaqI > < Apyl > < Alwl > < Afal x Nlalll 

TTGGTTTTCA CTCGAAATCC AGGATCTAGA AGAACCTTGT ACCAAAGTCT AAACGAACAT GAAACTTCTC 
27730 27740 27750 27760 27770 27780 27790

X HinPlI 
x Hin6I 
X Hhal 
x Rsal x Haell 
. X Sfcl X Eco47III 

x Csp6IX Cfol SfaNI X 
x Ndel x Afal X Bspl43II 

ATTGTTTTGA CTTGTATTTC TCTATGCAGT TGCATATGCA CTGTAGTACA GCGCTGTGCA TCTAATAAAC 
27800 27810 27820 27830 27840 27850 27860 

X XhoII 
x Sau3AI 
X Ndell 
> < Mnll 

X Mfll 



FIGURE 13.64 
X Mbol
>< DpnII 

>< Dpnr >< Rsal 

>< BstYI x MboII 
X Nlalllx BspAI x Csp6I x Rmal 

x Alwl x Bspl43I x Afal >< Mael 
CTCATGTGCT TGAAGATCCT TGTAAGGTAC AACACTAGGG GTAATACTTA TAGCACTGCT TGGCTTTGTG 
27870 27880 27890 27900 27910 27920 27930 

X Sdul 
x Rmal 

X NspII 

x Mael 
x HgiAI 

x Bspl286r. >< NspI 

>< Bmyl X NspHI 

x Alw21I >< Nlalll x Maelll 

CTCTAGGAAA GGTTTTACCT TTTCATAGAT GGCACACTAT GGTTCAAACA TGCACACCTA ATGTTACTAT
27940 27950 27960 27970 27980 27990 28000

> < XhoII 

> < Sau3AI > < Van91I 

X PvuII 
>< PspSi 

> < Udell > < PflMI 

> < MfllX NspBII 

> < DpnII X HinPlI 

X Bspl43I x Hin6I 

> < BstYI > < Ball X Hhal x Rmal 

> < BspAI > < BsiYIx Cfol x Mael 

> < MboIX Alulx BspWI x BspWI 
X Alwl X Dpnl > < AccB7I X Alul 



X Rsal 
X NlalV 

x l^>nl x Nlalll 
X Eco64I x Maelll 

>< Csp6Ix HphI 
X BscBI >< Eco0651 

>< BanI X BspHI 
x Asp718 >< Eco91I 

X Afal X BstPI 

X AccBlI >< BstEII 

X Acc65I >< Bbvl 



CAACTGTCAA GATCCAGCTG GTGGTGCGCT TATAGCTAGG TGTTGGTACC TTCATGAAGG TCACCAAACT
28010 28020 28030 28040 28050 28060 28070

x Rsal 
X.Fnu4HI x Maell 
X Esp3I x Csp6I 
X BsraAI X BsmBI 

x Alw26I x Afal 



X Tru9I 
X Msel 
X Dral 



x Slnl 
X Sau96l 
x NspIV 
NspHII X 
NlalV X 

X Eco47I 
X Cfrl3I 
x BsiZI 
BscBI X 



GCTGCATTTA GAGACGTACT TGTTGTTTTA AATAAACGAA CAAATTAAAA TGTCTGATAA TGGACCCCAA
28080 28090 28100 28110 28120 28130 28140

x NspII 

X Bspl286I 

X Bmyl 

x Maell >< Acil 



x SinI 
x Sau96I 
x NspIV 
x NspHII 

x NlalV 
x Eco47l 
x Cfrl3I 
X BsiZI 

x BscBI 
>< Bmel8I 
>< Avail x Tfil 
x Asul X Hinfl 



FIGURE 13.65
TCAAACCAAC GTAGTGCCCC CCGCATTACA TTTGGTGGAC CCACAGATTC AACTGACAAT AACCAGAATG
28150 28160 28170 28180 28190 28200 28210 

X HinPlI >< Styl 
>< Haell 

> < Pall X Hin6l X. EcoTHI 

> < Haelll x Hhaix Ecol30X 

>< BapWI >< BssTlI 

> < BsuRI >< Bspl43II 

>< Hgal> < BshI >< Cfoix BsaJI x Hgal 
GAGGACGCAA TGGGGCAAGG CCAAAACAGC GCCGACCCCA AGGTTTACCC AATAATACTG CGTCTTGGTT 
28220 28230 28240 28250 28260 28270 28280 

■ X TthHB8I 

> < ScrFI 
x Pall 

x PaeR7I 
x NspIH 

> < Mval 
x Haelll 
X EcoRH 

X Eco88I 

X Xhol > < EC1136I 
x D3aV 
. X BsuRI 
x Slal > < BstOI 
X MnllX Taql> < BstNI 
X Ccrl > < BsiLI 
X Hinfl X BshI 
x Tfiix Bcoix BsaJI 
X Mnll x Ddel x Aval > < Apyl 

x Alul >< Ddel > < Nlalll x Bfrl X Ama87l x Mnll 
CACAGCTCTC ACTCAGCATG GCAAGGAGGA ACTTAGATTC CCTCGAGGCC AGGGCGTTCC AATCAACACC 
28290 28300 28310 28320 28330 28340 28350 

x SinI 
X Sau96I 
x NspIV 

X NspHII 
x Eco47I 
x Cxrl3I 
x BsiZl 

>< Bmel8I > < Ksp632I 

x Avail > < Eamll04I 

>< > < Earl > < Alulx MboII >< Maelll 

AATAGTGGTC CAGATGACCA AATTGGCTAC TACCGAAGAG CTACCCGACG AGTTCGTGGT GGTGACGGCA 
28360 28370 28380 28390 28400 28410 28420 



X SstI
x SduI
>< SacI
X NspII
X HgiAI
X EspI
X Eco24I
>< Sau96I
X Ecl136II
x StyI
x PalI
X DdeI
x RmaI
x NspIV
X CelII
x MaeI
x HaeIII
x Bsp1286I
X EcoT14I
X Cfr13I
X Bpu1102I
x Eco130I
x BsuRI
X BmyI
>< BssT1I
> < BsrI
X BanII
x RsaI
X BsaJI
X BsiZI

FIGURE 13.66
x Alw21I >< Csp6I >< Blnl >< Bshlx Hindlll
X HphI >< Alul >< Afal >< Avrll >< Asul >< Alul 

AAATGAAAGA GCTCAGCCCC AGATGGTACT TCTATTACCT AGGAACTGGC CCAGAAGCTT CACTTCCCTA 
28430 28440 28450 28460 28470 28480 28490 

>< HinPlI 
>< Hin6I 
X Hhal 
X Haell 

X Cfol > < Mnll x NlalV 

X Bspl43II x SfaHI X Ddel X BscBI 

CGGCGCTAAC AAAGAAGGCA TCGTATGGGT TGCAACTGAG GGAGCCTTGA ATACACCCAA AGACCACATT 
28500 28510 28520 28530 28540 28550 28560

>< NlalV 
x Eco64I 

X BscBI 
X BanI 

>< Acil 

X AccBlI X Bbvl X Fnu4HI >< Mnll 

GGCACCCGCA ATCCTAATAA CAATGCTGCC ACCGTGCTAC AACTTCCTCA AGGAACAACA TTGCCAAAAG 



x Maell x Mvnl 
>< Mnll BstOI X 

X Fnu4HI X Ksp632I Bsp50I X 

x BspWI >< Earl x BsaAlx Acil 

X Mnll X Mnll X AcilX MboII X Eamll04I AccII X 

GCTTCTACGC AGAGGGAAGC AGAGGCGGCA GTCAAGCCTC TTCTCGCTCC TCATCACGTA GTCGCGGTAA 
28640 28650 28660 28670 28680 28690 28700 

X ScrFI 
X Mval 

X EcoRII >< TthHB8I 

x Ecll36I x Rmal 

x DsaVX Fnu4HI >< Nhel 

X BstOI X Mnll 

>< BstHI >< Mael 

>< BsiLI > < BspWI 

X Apyl X Bbvl x TaqI X Acil 

TTCAAGAAAT TCAACTCCTG GCAGCAGTAG GGGAAATTCT CCTGCTCGAA TGGCTAGCGG AGGTGGTGAA 

28710 28720 28730 28740 28750 28760 28770 

> < Thai 

> < Mvnl 

x HphI x Mnll 

> < HinPlI 

> < Hind 

x Hhal 

> < BstOI x Bmal Pair >< 

> < BspSOI x Mael Haelll x 
X Bbvl X CfoIX Fnu4HI BsuRI X 

> < Accllx BspWI • x Alul BshI x 
ACTGCCCTCG CGCTATTGCT GCTAGACAGA TTGAACCAGC TTGAGAGCAA AGTTTCTGGT AAAGGCCAAC 

28780 28790 28800 28810 28820 28830 28840 

Rsal x 

> < Pallx Maelll >< 

> < Haelll x Fnu4HI Maell X 

> < BsuRI x Ddel x Ddel Csp6I x 
> < BshI > < Bbvl x Hnll >< BspWI >< sfaHI ACal x

AACAACAAGG CCAAACTGTC ACTAAGAAAT CTGCTGCTGA GGCATCTAAA AAGCCTCGCC AAAAACGTAC 
28850 28860 28870 28880 28890 28900 28910 

X Tthllll 
>< SinI 
X Sau96l 
X NspIV 
X NspHII 
> < Maell 

X Eco47I 
X Cfrl3I 
X BsmBI 

x Rsal >< BsizI >K styl 

X Maeril X BmelBI X EcoTHI 

x Maell x Esp3I x Avail >< Bcol30I 

X Csp6I >< BsmAI X Asul X BssTlI 

>< Afa * >< Alw26I> < Aspl >< BsaJI 

TGCCACAAAA CAGTACAACG TCACTCAAGC ATTTGGGAGA CGTGGTCCAG AACAAACCCA AGGAAATTTC 
28920 28930 28940 28950 28960 28970 28980 

x sinl 
X Sau96I 
X NspIV 

X NspHII 
, X NlalV 
X Eco47I 
X Cfrl3I 
X BsiZI 

X BscBI 
X Brael8I 
X Avail 
X Asul 



x Pall 
x Haelll 
x Gdlll 

X Fnu4HI 
>< Eael 
>< BsuRI 
X BshI 
X Acil 



BspWI > 
>< BspWI 



GGGGACCAAG ACCTAATCAG ACAAGGAACT GATTACAAAC ATTGGCCGCA AATTGCACAA TTTGCTCCAA 
28990 29000 29010 29020 29030 29040 29050 

>< BsmI >< HialU 
x BscCI x Mali >< Maelll x Maelll >< Nlalll 
GTGCCTCTGC ATTCTTTGGA ATGTCACGCA TTGGCATGGA AGTCACACCT TCGGGAACAT GGCTGACTTA
29060 29070 29080 29090 29100 29110 29120



X Tru9I 
X NlalV 
X Nlalll 

X Msel 



x XhoII 
X Sau3AI 
>< Ndell 
x Mfll 
x Mbol 

X Fokl 
X DpnII 

> < Dpnl 
X BstYI 
C BspAI 



>< Tthllll 
X Maell 

BscBI X BstXIX Alwl> < Bspl43I X Aspl BspWI X ' 

TCATGGAGCC ATTAAATTGG ATGACAAAGA TCCACAATTC AAAGACAACG TCATACTGCT GAACAAGCAC 
29130 29140 29150 29160 29170 29180 29190 

Espl X 
Ddel x 
Cell I x 
Bpull02I x 

X Hgal RluI >K 

ATTGACGCAT ACAAAACATT CCCACCAACA GAGCCTAAAA AGGACAAAAA GAAAAAGACT GATGAAGCTC 
29200 29210 29220 29230 29240 29250 29260 

FIGURE 13.68 






X Plel 

X Fnu4HI >< HboII 

>< BspHI >< MboII X Ksp632I >< Gsul 

>< BsraAI >< Maelll X EarIX Fnu4KI 

X Alw26I >< Hinfl X Eamll04IX Bpml 

>< Acil X Fnu4HI >< Bbtrl >< Acil >< Nlalll 

AGCCTTTGCC GCAGAGACAA AAGAAGCAGC CCACTGTGAC TCTTCTTCCT GCGGCTGACA TGGATGATTT 
29270 29280 29290 29300 29310 29320 29330

x Nlalll >< Hinfl Nlalll >< 

X Fokl x Alul >< TfilX Ddel X BspHI 

CTCCAGACAA CTTCAAAATT CCATGAGTGG AGCTTCTGCT GATTCAACTC AGGCATAAAC ACTCATGATG 
29340 29350 29360 29370 29380 29390 29400 

x Maell >< Accl 

ACCACACAAG GCAGATGGGC TATGTAAACG TTTTCGCAAT TCCGTTTACG ATACATAGTC TACTCTTGTG 
29410 29420 29430 29440 29450 29460 29470 

X Tru9I 
X Tru9I 

X Msel 
X Msel 

>< XranI x Hpar 

X EcoRIX Maelll x Hindll Tru9I X 

x Asp700I x Bsgl x Hindi Msel X 

CAGAATGAAT TCTCGTAACT AAACAGCACA AGTAGGTTTA GTTAACTTTA ATCTCACATA GCAATCTTTA 
29480 29490 29500 29510 29520 29530 29540 

Xorll > 
TthH881 > 
TaqI >. 
Sau3AI X 
Rsal X 
x ThalPvuI > 
Ndell X 
X Mnll 
X MvnlMcrl > 
Mbol x 
DpnII X 
Dpnl >< 
Csp6l x 
x BstOI 
x Haelll BspCI > 
BspAI X 
>< TthHBBI x Bsp50I 

x Pall Bspl43I X 
X BsuRI BsiEI > 
>< BshlAXal X 

X Mnll x TaqI x Acil 

x Maelll x Mnll x AccII 

ATCAATGTGT AACATTAGGG AGGACTTGAA AGAGCCRCCA CATTTTCATC GAGGCCACGC GGAGTACGAT 
29550 29560 29570 29580 29590 29600 29610 

x Sdul 
X NspII 

x MboII x Vspl 

X Ksp632I x Eco24I x Tru9I 

x Rsal x Rmal x Fnu4HI x Bspl286I x Msel 

X Csp6I x Mael X Earl x Brayl x Asnl 

x Afal >< Bbvl > < Aluix Eamll04I x Banll x Asel 
CGAGGGTACA GTGAATAATG CTAGGGAGAG CTGCCTATAT GGAAGAGCCC TAATGTGTAA AATTAATTTT
29620 29630 29640 29650 29660 29670 29680 

>< Tru9I X Ddel 
>< Ms el >< Bfrl 
>< Nlalll > < Alul 
AGTAGTGCTA TCCCCATGTG ATTTTAATAG CTTCTTAGGA GAATGACAAA AAAAAAAAAA AAAAAA 
29690 29700 29710 29720 29730 29740 
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1-3059 1 CTCTTCTGGAAAAAGC-rAGGCTTATCATTAGAGAAMCAACAGAGTTGTGGTTTCAAGTG 

S-040530 

1-3059 61 ATATTCTTGTTAACAACTAAACGAACATGTTTATTTTCTTATTATTTCTTACTCTCACTA 

S-040530 1 GG"T"C"C'"""" , »C ,,n C"'' n C n GC'*G""C , " , G nn C""G""C" 

1-3059 121 GTGGT AGTGAC CTTGACCGGTG CACCACTTTTGATGATGTTCAAGCTCCTAATTACACTC 

S-040530 44 "C , "■C• , "C , '" n ""G""""■ , """" ,, "•'■"!C""C""C""C""G""G" 0 C 0 "C n "C" n "■"•C' , 

1-3059 181 AACATACTT_CATCTATGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACT 

S-040530 104 "G""C B "CAG""G"_"""C""""CVG"-"""C""C""C""G" , 'C , '""C"GAGC n "" n, 'C 

1-3059 240 CTTTATTTAACTCAGGATTTATTTCTTCCATTTTATTCTAATGTTACAGGGTTTCATACT 

S-040530 163 ""G""CC"G""C"" n,, "CC"G 0 "C""G""C""C" n CAGC""C'*"G° ,, C" n C , *"C" ,, C ,, C 

1-3059 300 ATTAATCATACGTTTGGCAACCCTGTCATACCTTTTAAGGATGGTATTTATTTTGCTGCC 

S-040530 223 ""C n "C n "C""C nn C" nn " ,, »»"C""G Bn C""C""C" ,,n " n C n "C" n C , '"C"-C"-C""- 

1-3059 360 ACAGAGAAATCAAATGTTGTCCGTGGTTGGGTTTTTGGTTCTACCATGAACAACAAGTCA 

S-040530 283 ""cr"" n, 'GAGC" n C""G""G""G" n C" n,, " ,, G" ,, C"' , CAGC nB,, " M "' , """"""--AGC 

1-3059 420 CAGTCGGTGATTATTATf AACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGAA 

S-040530 343 ""•'AGC"""--C , "C n, 'C" , *"""CAGC" ,, C""C""G""G n "C"''G" , *C , " , C"-"--C""G 

S-040530 403 C" nn "C"" no """ n C nn " nn C nn C ,, "G nn C nn """ B """ nn C nn C" nnn "C" B C n "C""" 

I-30S9 540 ATATTCGATAATGCATTTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGAI 

S-040530 463 •"C""""' , C nn C n "C""C" 0 C"" ,, "'X nn """"" B " n "CAGC n "C n "" n ''CAGC" , 'G""C 

I-30S9 600 GTTTCAGAAAAGTCAGGTAATTTTAAACACTTACGAGAGTTTGTGTTTAAAAATAAAGAT 

S-040530 523 BB GAGC""G"""AGC , '"C' ,, 'C""C""G , " , "C"G" ,, G , " ,B " , 'C" ,M ""'C , '"G""C""G , '"C 

1-3059 660 GGGTTTCTCTATGTTTATAAGGGCTATCAACCTATAGATGTAGTTCGTlGATCTACCTTCT 

S-040530 583 ""C""C""G""C""G" , 'C" n " n "-" B C""G""C""C''"C""G»"GA B A" B C""G""CAGC 

1-3059 720 GGTTTTAACACTTTGAAACCTATTTTTAAGTTGCCTCTTGGTATTAACATTACAAATTTT 

S-040530 643 nB C n "C B "" nn CC B """G""C n "C B "C'"'"C"" ,M, C , "'G 0 "C""C n " , "" , C""C""C""C 

1-3059 780 AGAGCCATTCTTACAGCCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCTGCAGCC 

S-040530 703 C"G"""""C , *"G""C'"""""'AGC"" , '""C"■ , G ^ " ,, ■" , C , "'"" B " ^ "CAGC B "C" B C'" , " 

1-3059 840 TATTTTGTTGGCTATTTAAAGCCAACTACATTTATGCTCAAGTATGATGAAAATGGTACA 

S-040530 763 ""C B "C""G" n,, ""CC"G"""*" , T""C""C""C ,,,, "" n G" , "" , "C''"C , '"G""C"' , C ,,,, C 

1-3059 900 ATCACAGATGCTGTTGATTGTTCTCAAAATCCACTTGCTGAACTCAAATGCTCTGTTAAG 

S-040530 823 """ BB C' , "C" n C" , *G' ,B C'" , CAGC' , "G""C""C n "G , ' B C""G , '"G""G , ' n "AGC n "G" 0 " • 

1-3059 960 AGCTTTGAGATTGACAAAGGMTTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCaGGA 

S-040530 883 """""C nn """C"* ,B,, "G""C""C ,, """" B "" B AGC n,, C H " , ' , '"A , " , G' ,n G'' n TAGC n "C 

1-3059 1020 GATGTTGTGAGATTCCCTAATATTACAAACTTGTGTCCTTTTGGAGAGGTTTTTAATGCT 

S-040530 943 •""•"•?G n ""C"G" , """'C" B " B "C , "'C B ""C BBBB C BB C" B C B "C nB A ,,B G , " , C B,, C B "C 

S-040530 1003 BB C Bn G" BB ""CAGC""G""C""C B "" , '" B C B G BB G" B G" n ^GC^C^ B C""G"C^"C 

1-3059 1140 TACTCTGTGCTCTACAACTCAACATTTTTTTCAACCTTTAAGTGCTATGGCGTOTCTGCC 

S-040530 1063 " , •"AGC B, ' B " , •G BBB " B^ " ,, C'' ,, C""C""CAGC'" ,B " B C'""' BBBB "C'"" , " B GAGC BB • , 

1-3059 1200 ACTAAGTTGAATGATCTTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGA 

S-040530 1123 ' , "C BB, 'C n " B,, C' ,B C , " , G nOB " B "AG'"" , C , "'G'" , C , " , C" B CAGC'"'C" B G B "G"" , "" , C 

S-040530 1183 " n C B "C""G"" BBB G BB C""C BB T B "C n "G°"C"^ 
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1-3059 1320 TTGCC^GATGATTTCATGGGTTGTGTCCTTGCTTGGAATACTAGGAACATTGATGCTACT 

S-040530 1243 C """"C , "C""C n """ 0 "*"C°"C n "G""G-"C C " n CC n,, °""""C''"C'"'C"A 

1-3059 1380 TCAACTGGTAATTATAATTATAAATATAGGTATCT7AGACATGGCAAGCTTAGGCCCTTT 

S-040530 1303 AGC""C n "C""C ,, °C"" ,, ""C n,, G , **CC"C" D C nn GC' , G" n C"" n ""°""GC"" B "- ,,, C 

1-3059 1440 GAGAGAGACATATCTAATGTGCCTTTCTCCCCTGATGGCAAACCTTGCACCCCACCTGCr 

S-040530 1363 ""»C"G"""""C"C , "'C"" ,,,, ' , C"""AG"""C" n C , *"''""G""C" , "'"""""C"""""C 

1-3059 1500 CTTAATTGTTATTGGCCATTAAATGATTATGGTTTTTACACCACTACTGGCATTGGCTAC 

S-040530 1423 ""G""C""C , "'C°" , "*".CC"G""C''"C' , "C , '"C""C"" , "'""" n C" n C , '"" 0 "C" nn, '"T 

1-3059 1560 CAACCTTACAGAGTTGTAGTACTTTCTTTTGAACrnTAAATGCACCGGCCACGGTTTGT 

S-040530 1483 ""G""C n "- , """"G""G""G n "GAGC""C""G , "'GC"G""C'" , C*" , T" , ' ,, °"C""G" a C 

1-3059 1620 GGACCAAAATTATCCACTGACCTTATTAAGAACCAGTGTGTCAATTTTAATTTTAATGGA 

S-040530 1543 ""C""C""GC"GAG""-C"""""G""C""""""""-""C"-G""C""C""C""C""C""C 

1-3059 1680 CTCACTGGTACTGGTGTGTTAACTCCTTCTTCA AAGAGATTTCAACCATTTCAACAAT 

S-04O530 1603 '•»G'"-C" , 'C n "C"°C"""C n G 0 "C" n »AG ,, "GC"""C n C""C" n G D "C ,, '*C""G""G" 

1-3059 1738 TTGGCCGTGATGTCTCTGATTT(^CTGATTCCGTTCGAGATCCTAAAACATCTGAAATAT 

S-040530 1661 "C n "°"G"""""GAGC , "'C" n " , '"C""CAG"""G , '"G""C""C""G n "CAGC n,, G" B CC 

1-3059 1798 TAGACATTTCACCTTGCTCTTTTGGGGGTGTAAGTGTAATTACACCTGGAACAAATGCTT 

S-040530 1721 n G"""""CAGC""C"— AGC""C" , *C""C n,, GTCC""G- ,, C" , 'C° n C""C"' , C n "C" n CA 

1-3059 1858 _CATCTGAAGTTGCTGTTCTATATCAAGATGTTAACTGCACTGATGTTTCTACAGCAATC 

S-040530 1781 G""G"_"-"•'"G""C""G""G""C"••G""C""G'"'"""""»C-"C""GAGC"-C"»C••"■ 

1-3059 1917 CATGCAGATCAACTCACACCAGCTTGGCGCATATATTCTACTGGAAACAATGTATTCCAG 

S-040530 1840 ""C""C""C"-G""G" ,, C""C n "C n """"G n "C n "CAGC , '"C n "G"-"""C""G""-"»» 

1-3059 1977 ACTCAAGCAGGCTGTCTTATAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATT 

S-040530 1900 ■"•C" n G""C"""""C""G""C""C""C""" , '"C""G , "'"""CAGC""C""""' , " , " , " , '"C 

1-3059 2037 CCTATTGGAGCTGGCATTTGTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAGC 

S-040530 1960 ""C""""''C""C""A"''C""C""C""C"""""C n "C , '"GAGCC"GC"G""G*'"C""C B "" 

1-3059 2097 CAAAAATCTATTGTGGCTTATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCT 

S-040S30 2020 »"G""G""C""C"""""C""C""C n " n AGCC"G" n C ,,n C""C""CAGC' , "C""C"""AGC 

1-3059 2157 AATAACACCATTGCTATACCTACTAACTTTTCAATTAGCATTACTACAGAAGTAATGCCT 

S-040530 2080 •"•C n """ ,, """C""C""C""C""C"""'*"CAGC° , 'CTC"""C' ,n C""C" n """G n """"C 

1-3059 2217 GTTTCTATGGCTAAAACCTCCGTAGATTGTAATATGTACATCTGCGGAGATTCTACTGAA 

S-040530 2140 ■">GAGC" ,,n ' M 'C""G""AAG* , ""G , " , * , ""C n "C ■■«■■■ '""»c""CAGC""C" n G 

1-3059 2277 TGTGCTAATTTGCTTCTCCAATATGGTAGCTTTTGCACACAACTAAATCGTGCACTCTCA 

S-040530 2200 "■X""C" , 'CC""""G n "G" , 'G" ,, C""C"" n ""C'*" , '''"C" ,, G""G""C""G""C n "GAGC 

1-3059 2337 GGTATTGCTGCTGAACAGGATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAATG 

S-040530 2260 •■"C""C""C""C" n G n """ ,, C""G ,, ""''"CA"A"""''"""""""C' , """"G""G""G" n " 

1-3059 2397 TACAAAACCCCAACTTTGAAATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGAC 

S-040530 2320 ""T" ,, G"' , ''"'*C""CC , •"" ,, G'"'C""C" n G""C""C"'•C ,, "C""T"»G""CC• , G n "C■"'" 

1-3059 2457 CCTCTAAAGCCAACTAAGAGGTCTTTTATTGAGGACTTGCTCTTTAATAAGGTGACACTC 

S-040530 2380 " ,,n,,, 'G"""" n C'"'C"""C"C""C""C"''C""' , "' ,,, C' , " n °G n "C'" , C n "A"""""C""G 

1-3059 2517 GCTGATGCTGGCTTCATGAAGCAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGAT 

S-040530 2440 "-C"' , C" n C""" ,, "T , '"""""""G H "C" n "''"G"""""G""C , '"C""C"''C n "CC"G' , »C 

1-3059 2577 CTCATTTGTGCGCAGAAGTTCAATGGGCTTACAGTGTTGCCACCTCTGCTCACTGATGAT 

S-040530 2500 ""G""C""C""C""""""""T""C 0 "" , " , G""C" ,, "C""" n C , " , C" n " ,,, 'G'" , C""C" , 'C 

1-3059 2637 ATGATTGCTGCCTACACTGCTGCTCTAGTTAGTGGTACTGCCACTGCTGGATGGACATTT 

S-040530 2560 """"• , C""C n """ n T""A""C" n C , '"G""G""C""C" n C" , '"""C , " , C""C" ,, " , '"C""C 

FIGURE 32.2 
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1-3059 2697 GGTGCTGGC6CTGCTCTTCRARTACCTTTT6CTATGCABATGGCATATAGGTTCAATGGC 

S-040530 2620 ''"A" n C""A 0 "C"■ I C ,, '■G""G""C-"C""C , •"C n °" n^ G ^, "" , "C°"CC" n ■■■'"•' , *C• , *■ 

1-3059 2757 ATTGGAGTTACCCAAAATGTTCTCTATGAGAACCAAAAACAAATCGCCAACCAATTTAAC 

S-040530 2680 ■"'C" n C"°G"""" "G""C 0 "G""G""C n """ n """G""G""G""""" ,,0 " , " , "G"' , C" 0 " 

1-3059 2817 AAGGCGATTAGTCAAATTCAAGAATCACTTACAACAACATCAACTGCATTGGGCAAGCTG 

S-040530 2740 ""'•»»c"-C'* ,, C""G°"C"-G""GAGC"-G""C""" , " ,, CAGC""C""CC , " , """" n "" B " 

1-3059 2877 CAAGACGTTGTTAACCAGAATGCTCAAGCATTAAACACACTTGTTAAACAACTTAGCTCT 

S-040530 2800 -"G"""''"G ,, "G""° , "'"""C ,, "C""G""CC"G" , " , " M C , "'G , '"G , " , G" 0 G'"'G ,, " n AGC 

1-3059 2937 AATTTTGGTGCAATTTCAAGTGTGCTAAATGATATCCTTTCGCGACTTGATAAAGTCGftG 

S-040530 2860 »"C B "C""C , " , C""CAGCTC" , ' n "" n G ,,n C" ,, C" unnn GAGCA"G" n G , r B C" n, " n "G""" 

1-3059 2997 GCGGAGGTACAAATTGACAGGCTAATTACAGGCAGACTTCAAAGCCTTCAAACCTATGTA 

S-040530 2920 "«c n "A ,,n G" ,, G°"C"" n C" no,, G nn C* , "C n "AC , 'C ,,,, G""GTC"" I 'G , "'G" n """C""G 

1-3059 3057 ACACAACAACTAATCAGGGCTGCTGAAATCAGGGCTTCTCCTAATCTTGCTGCTACTAAA 

S-040530 2980 «»c , ' ,, G , *"G"' , G" n ' , " ,, A" n C"' , C ,,,, G , '""C n " nn CAGC" n C"" n '" , G'" , C" n C" n C'* 0 G 

1-3059 3117 ATGTCTGAGTGTGTTCTTGGACAATCAAAAAGAGTTGACTTTTGTGGAAAGGGCTACCAC 

S-040530 3040 -»"AGC"" n ""C""G'' ,, G" , *C , ' n GAGC""G n ' , " ,,n G n, "" , "C" 0 C'"'C' ,B """" n "T"" 

1-3059 3177 CrTATGTCCTTCCCflCAAGCAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTG 

S-040530 3100 »»G' , ""AG" n """"C•'"G■"'C"• , """C n ''C" n C n "G""G*• n " n "G"•'C'"'G" B C" B C' , -" 

1-3059 3237 CCATCCCAGGAGAGGAACTTCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATAC 

S-040530 . 3160 ""TAG-""" 1 "'^ B »»»»»»» C n« C C -« Cn n c n» c »n G n nn »,. G „„ C «« D 

1-3059 3297 TTCCCTCGTGAAGGTGTTTTTGTGTTTAATGGCACTTCTTGGTTTATTACACAGAGGAAC 

S-040530 3220 '•n«'"'C""G ,,n G n "C"' , G ,, °C" B,, " n """C" M ' ,, *"CAGC' ,n " ,,,, C ,, "C""C ,,, * ,, C B C B "" 

1-3059 3357 TTCTTTTCTCCACAAATAATTACTACAGACAATACATTTGTCTCAGGAAATTGTGATGTC ' 

S-040530 3280 """'•"CAGC""C""G""C , "'C ,,n C"""" n,, ""C n "C ,, "C n "G""C""C" n C" ,, """""."G 

1-3059 3417 GTTAlTGGCATCATTAACAACACAGrTTATGATCCTCTGCAACCTGAGCTTGACTCATTC 

S-040530 3340 •"•G""C""""- ,,, "C , "'T''"" n "C""G n "C""C" B C"""»"G n "C"'"" M 'G''""AGC B "" 

1-3059 3477 AAAGAAGAGCTGGACAAGTACTTCAAAAATCATACATCACCAGATGTTGATCTTGGCGAC 

S-040530 3400 "«G""G""""" , '""''""A"" , "'" , '""G" , 'C""C''"C-"C" ,, C" n C""G" , 'C''"G""" ,, "T 

1-3059 3537 ATTTCAGGCATTAACGCTTCTGTCGTCMCATTCAAAAAGAAATTGACCGCCTCAATGAG 

S-040530 3460 ""CAGC"'' n ""C n, '""''C' , ''C" ,, G""G"""''"C" n G"' , G n ''G""C n ""A B A B ''G ,,,, C""A 

1-3059 3597 GTCGCTAAAAACTTAAATGAATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAA 

S-040530 3520 "»G" B C ,, "G''"CC' , G n "C""GAGC''"G n "C ,, "" , "'G" n G n "GC , " , " n C ,, "G' , "C"" B " 0 G 

1-3059 3657 TATATTAAATGGCCTTGGTATG 

S-040530 3580 ' " 

1-3059 3717 t 

S-040530 3640 "»»""G»»c BB "C" BnH G" B C BB " B """"C""C""C B "T""C"" ,, ""G" B A" B C""C""» 

1-3059 3777 TCTTGTGGTTCTTGCTGCAAGTTTGATGAGGATGACTCTGAGCCAGTTCTCAAGGGTGTC 

S-040530 3700 AGC" B ' , " ,, CAGC'"*""""""" , "'C" ,, C" nn, "'C n, "'AGC'""" , "C"G" ,, G , " , " ,,n C""G 

1-3059 3837 AAATTACATTACAWTAAACGAACTTATGGATTTGTTTATGAGATTTTTTACTCTTGGAT 

S-040530 3760 »»GC"G""C n """ B C n G"T__ ,,,, " B CGA" 

1-3059 .3897 < 
S-040530 
USE OF PROTEINS AND PEPTIDES ENCODED BY
THE GENOME OF A NOVEL SARS-ASSOCIATED
CORONAVIRUS STRAIN

[0001] The present invention relates to a novel strain of 
severe acute respiratory syndrome (SARS)-associated coro- 
navirus derived from a sample recorded under No. 031589 
and collected in Hanoi (Vietnam), to nucleic acid molecules 
derived from its genome, to the proteins and peptides 
encoded by said nucleic acid molecules and to their applications,
in particular as diagnostic reagents and/or as vaccines.

[0002] Coronavirus is a virus containing single-stranded 
RNA, of positive polarity, of approximately 30 kilobases 
which replicates in the cytoplasm of the host cells; the 5' end 
of the genome has a capped structure and the 3' end contains 
a polyA tail. This virus is enveloped and comprises, at its 
surface, peplomeric structures called spicules. 

[0003] The genome comprises the following open reading 
frames or ORFs, from its 5' end to its 3' end: ORF1a and
ORF1b corresponding to the proteins of the transcription-replication
complex, and ORF-S, ORF-E, ORF-M and
ORF-N corresponding to the structural proteins S, E, M and 
N. It also comprises ORFs corresponding to proteins of 
unknown function encoded by: the region situated between 
ORF-S and ORF-E and overlapping the latter, the region 
situated between ORF-M and ORF-N, and the region 
included in ORF-N. 

[0004] The S protein is a membrane glycoprotein (200- 
220 kDa) which exists in the form of spicules or spikes 
emerging from the surface of the viral envelope. It is 
responsible for the attachment of the virus to the receptors 
of the host cell and for inducing the fusion of the viral 
envelope with the cell membrane. 

[0005] The small envelope protein (E), also called sM 
(small membrane), which is a nonglycosylated transmem- 
brane protein of about 10 kDa, is the protein present in the 
smallest quantity in the virion. It plays a major role in the
coronavirus budding process, which occurs at the level of the
intermediate compartment in the endoplasmic reticulum and
the Golgi apparatus.

[0006] The M protein or matrix protein (25-30 kDa) is a 
more abundant membrane glycoprotein which is integrated 
into the viral particle by an M/E interaction, whereas the 
incorporation of S into the particles is directed by an S/M 
interaction. It appears to be important for the viral matura- 
tion of coronaviruses and for the determination of the site 
where the viral particles are assembled. 

[0007] The N protein or nucleocapsid protein (45-50 kDa) 
which is the most conserved among the coronavirus struc- 
tural proteins is necessary for encapsidating the genomic 
RNA and then for directing its incorporation into the virion. 
This protein is probably also involved in the replication of 
the RNA. 

[0008] When the host cell is infected, the reading frame
(ORF) situated at the 5' end of the viral genome is translated
into a polyprotein which is cleaved by the viral proteases,
releasing several nonstructural proteins such as the
RNA-dependent RNA polymerase (Rep) and the ATPase helicase
(Hel). These two proteins are involved in the replication of
the viral genome and in the generation of the transcripts which
are used in the synthesis of the viral proteins. The mechanisms
by which these subgenomic mRNAs are produced are
not completely understood; however, recent findings indicate
that the transcription-regulating sequences at the 5' end
of each gene represent signals which regulate the discontinuous
transcription of the subgenomic mRNAs.

[0009] The proteins of the viral membrane (S, E and M 
proteins) are inserted into the intermediate compartment, 
whereas the replicated RNA (+ strand) is assembled with the 
N (nucleocapsid) protein. This protein-RNA complex then 
combines with the M protein contained in the membranes of 
the endoplasmic reticulum and the viral particles form when 
the nucleocapsid complex buds into the endoplasmic reticu- 
lum. The virus then migrates across the Golgi complex and 
eventually leaves the cell, for example by exocytosis. The 
site of attachment of the virus to the host cell is at the level 
of the S protein. 

[0010] Coronaviruses are responsible for 15 to 30% of 
colds in humans and for respiratory and digestive infections 
in animals, especially cats (FIPV: Feline infectious perito- 
nitis virus), poultry (IBV: Avian infectious bronchitis virus), 
mice (MHV: Mouse hepatitis virus), pigs (TGEV: Transmissible
gastroenteritis virus, PEDV: Porcine epidemic diarrhea
virus, PRCoV: Porcine respiratory coronavirus, HEV:
Hemagglutinating encephalomyelitis Virus) and bovines 
(BCoV: Bovine coronavirus). 

[0011] In general, each coronavirus affects only one species;
in immunocompetent individuals, the infection induces
antibodies, which may be neutralizing, and cellular immunity
capable of destroying the infected cells.

[0012] An epidemic of atypical pneumonia, called severe
acute respiratory syndrome (SARS), spread in various
countries (Vietnam, Hong Kong, Singapore, Thailand and
Canada) during the first quarter of 2003, from an initial
focus which appeared in China in the last quarter of 2002.
The severity of this disease is such that its mortality rate is
about 3 to 6%. Numerous laboratories worldwide are working
to determine the causative agent of this disease.

[0013] In March 2003, a new coronavirus (SARS-CoV or 
SARS virus) was isolated, in association with cases of 
severe acute respiratory syndrome (T. G. KSIAZEK et al., 
The New England Journal of Medicine, 2003, 348, 1319- 
1330; C. DROSTEN et al., The New England Journal of 
Medicine, 2003, 348, 1967-1976; Peiris et al., Lancet, 2003, 
361, 1319). 

[0014] Genomic sequences of this new coronavirus have 
thus been obtained, in particular those of the Urbani isolate 
(Genbank accession No. AY274119.3 and A. MARRA et al.,
Science, May 1, 2003, 300, 1399-1404) and the Toronto
isolate (Tor2, Genbank accession No. AY278741 and A.
ROTA et al., Science, 2003, 300, 1394-1399).

[0015] The organization of the genome is comparable with
that of other known coronaviruses, which confirms that
SARS-CoV belongs to the Coronaviridae family. In particular,
the following have been identified: open reading frames
ORF1a and ORF1b; open reading frames corresponding to the
S, E, M and N proteins; and open reading frames corresponding
to proteins encoded by the region situated between
ORF-S and ORF-E (ORF3), the region situated between
ORF-S and ORF-E and overlapping ORF-E (ORF4), the
region situated between ORF-M and ORF-N (ORF7 to
ORF11) and the region corresponding to ORF-N (ORF13
and ORF14).

[0016] Seven differences have been identified between the 
sequences of the Tor2 and Urbani isolates; 3 correspond to 
silent mutations (c/t at position 16622 and a/g at position 
19064 of ORF1b, t/c at position 24872 of ORF-S) and 4 
modify the amino acid sequence of respectively: the proteins 
encoded by ORF1a (c/t at position 7919 corresponding to 
the A/V mutation), the S protein (g/t at position 23220 
corresponding to the A/S mutation), the protein encoded by 
ORF3 (a/g at position 25298 corresponding to the R/G 
mutation) and the M protein (t/c at position 26857 corre- 
sponding to the S/P mutation). 

[0017] In addition, phylogenetic analysis shows that 
SARS-CoV is distant from other coronaviruses and that it 
did not appear by mutation of human respiratory coronavi- 
ruses nor by recombination between known coronaviruses 
(for a review, see Holmes, J. C. I., 2003, 111, 1605-1609). 

[0018] Identifying new variants and taking them into account 
is important for the development of reagents for the 
detection and diagnosis of SARS which are sufficiently 
sensitive and specific, and of immunogenic compositions 
capable of protecting populations against epidemics of 
SARS. 

[0019] The inventors have now identified another strain of 
SARS-associated coronavirus which is distinguishable from 
the Tor2 and Urbani isolates. 

[0020] The subject of the present invention is therefore an 
isolated or purified strain of severe acute respiratory syn- 
drome-associated human coronavirus, characterized in that 
its genome has, in the form of complementary DNA, a serine 
codon at position 23220-23222 of the gene for the S protein 
or a glycine codon at position 25298-25300 of the gene for 
ORF3, and an alanine codon at position 7918-7920 of 
ORF1a or a serine codon at position 26857-26859 of the 
gene for the M protein, said positions being indicated in 
terms of reference to the Genbank sequence AY274119.3. 

[0021] According to an advantageous embodiment of said 
strain, the DNA equivalent of its genome has a sequence 
corresponding to the sequence SEQ ID No: 1; this coro- 
navirus strain is derived from the sample collected from the 
bronchoalveolar washings from a patient suffering from 
SARS, recorded under the No. 031589 and collected at the 
Hanoi (Vietnam) French hospital. 

[0022] In accordance with the invention, said sequence 
SEQ ID No: 1 is that of the deoxyribonucleic acid corre- 
sponding to the ribonucleic acid molecule of the genome of 
the isolated coronavirus strain as defined above. 

[0023] The sequence SEQ ID No: 1 is distinguishable 
from the Genbank sequence AY274119.3 (Tor2 isolate) in 
that it possesses the following mutations: 

[0024] g/t at position 23220; the alanine codon (gct) at 
position 577 of the amino acid sequence of the Tor2 S 
protein is replaced by a serine codon (tct), 

[0025] a/g at position 25298; the arginine codon (aga) at 
position 11 of the amino acid sequence of the protein 
encoded by the Tor2 ORF3 is replaced by a glycine 
codon (gga). 



[0026] In addition, the sequence SEQ ID No: 1 is distin- 
guishable from the Genbank sequence AY278741 (Urbani 
isolate) in that it possesses the following mutations: 
[0027] t/c at position 7919; the valine codon (gtt) in 
position 2552 of the amino acid sequence of the protein 
encoded by ORF1a is replaced by an alanine codon 
(gct), 

[0028] t/c at position 16622: this mutation does not 
modify the amino acid sequence of the proteins 
encoded by ORF1b (silent mutation), 

[0029] g/a at position 19064: this mutation does not 
modify the amino acid sequence of the proteins 
encoded by ORF1b (silent mutation), 

[0030] c/t at position 24872: this mutation does not 
modify the amino acid sequence of the S protein, and 

c/t at position 26857: the proline codon (ccc) at position 
154 of the amino acid sequence of the M protein is 
replaced by a serine codon (tcc). 
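
The four amino acid changes listed above can be checked with a small codon lookup. The following is a minimal sketch, not part of the original disclosure: the table holds only the codons named in the text, and the offset of each changed base within its codon is inferred from the codon positions given in paragraph [0020] (an assumption of this sketch).

```python
# Minimal codon lookup: only the codons named in the text (not a full genetic code).
CODON_TO_AA = {
    "gct": "Ala", "tct": "Ser",
    "aga": "Arg", "gga": "Gly",
    "gtt": "Val",
    "ccc": "Pro", "tcc": "Ser",
}


def substitute(codon: str, offset: int, new_base: str) -> str:
    """Return the codon with the base at the 0-based offset replaced."""
    return codon[:offset] + new_base + codon[offset + 1:]


# (label, original codon, 0-based offset of the changed base, new base)
# Offsets are inferred from the codon coordinates in [0020], e.g. the
# change at 7919 falls in the middle of codon 7918-7920.
CHANGES = [
    ("S protein, codon 577", "gct", 0, "t"),
    ("ORF3, codon 11", "aga", 0, "g"),
    ("ORF1a, codon 2552", "gtt", 1, "c"),
    ("M protein, codon 154", "ccc", 0, "t"),
]


def describe(changes=CHANGES):
    """Return (label, amino acid before, amino acid after) for each change."""
    rows = []
    for label, codon, offset, base in changes:
        mutated = substitute(codon, offset, base)
        rows.append((label, CODON_TO_AA[codon], CODON_TO_AA[mutated]))
    return rows


if __name__ == "__main__":
    for label, before, after in describe():
        print(f"{label}: {before} -> {after}")
```
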

[0031] Unless otherwise stated, the positions of the 
nucleotide and peptide sequences are indicated with reference to 
the Genbank sequence AY274119.3. 

[0032] The subject of the present invention is also an 
isolated or purified polynucleotide, characterized in that its 
sequence is that of the genome of the isolated coronavirus 
strain as defined above. 

[0033] According to an advantageous embodiment of said 
polynucleotide, it has the sequence SEQ ID No: 1. 

[0034] The subject of the present invention is also an 
isolated or purified polynucleotide, characterized in that its 
sequence hybridizes under high stringency conditions with 
the sequence of the polynucleotide as defined above. 

[0035] The terms "isolated or purified" mean modified "by 
the hand of humans" from the natural state; in other words 
if an object exists in nature, it is said to be isolated or 
purified if it is modified or extracted from its natural 
environment or both. For example, a polynucleotide or a 
protein/peptide naturally present in a living organism is 
neither isolated nor purified; on the other hand, the same 
polynucleotide or protein/peptide separated from coexisting 
molecules in its natural environment, obtained by cloning, 
amplification and/or chemical synthesis is isolated for the 
purposes of the present invention. Furthermore, a polynucle- 
otide or a protein/peptide which is introduced into an 
organism by transformation, genetic manipulation or by any 
other method, is "isolated" even if it is present in said 
organism. The term purified as used in the present invention 
means that the proteins/peptides according to the invention 
are essentially free of association with the other proteins or 
polypeptides, as is for example the product purified from the 
culture of recombinant host cells or the product purified 
from a nonrecombinant source. 

[0036] For the purposes of the present invention, high 
stringency hybridization conditions are understood to mean 
temperature and ionic strength conditions chosen such that 
they make it possible to maintain the specific and selective 
hybridization between complementary polynucleotides. 

[0037] By way of illustration, high stringency conditions 
for the purposes of defining the above polynucleotides are 
advantageously the following: the DNA-DNA or DNA-RNA 
hybridization is performed in two steps: (1) prehybridization 
at 42° C. for 3 hours in phosphate buffer (20 mM, 
pH 7.5) containing 5xSSC (1xSSC corresponds to a 0.15 M 
NaCl+0.015 M sodium citrate solution), 50% formamide, 
7% sodium dodecyl sulfate (SDS), 10xDenhardt's, 5% 
dextran sulfate and 1% salmon sperm DNA; (2) hybridization 
for 20 hours at 42° C. followed by 2 washings of 20 
minutes at 20° C. in 2xSSC+2% SDS, 1 washing of 20 
minutes at 20° C. in 0.1xSSC+0.1% SDS. The final washing 
is performed in 0.1xSSC+0.1% SDS for 30 minutes at 60° 
C. 
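
As a worked illustration of the SSC notation used above, the salt concentrations of each buffer follow by scaling the 1xSSC definition. This is a minimal sketch; the function name and the rounding to six decimals are choices made here, not part of the protocol.

```python
# 1x SSC as defined in the text: 0.15 M NaCl + 0.015 M sodium citrate.
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold: float) -> dict:
    """Molar concentration of each salt in an n-fold SSC solution."""
    return {salt: round(conc * fold, 6) for salt, conc in SSC_1X.items()}


if __name__ == "__main__":
    # The three strengths that appear in the conditions above.
    for fold in (5, 2, 0.1):
        print(f"{fold}xSSC:", ssc_molarity(fold))
```

Under this definition, the prehybridization buffer (5xSSC) contains 0.75 M NaCl and 0.075 M sodium citrate, and the final wash (0.1xSSC) contains 0.015 M NaCl and 0.0015 M sodium citrate.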

[0038] The subject of the present invention is also a 
representative fragment of the polynucleotide as defined 
above, characterized in that it is capable of being obtained 
either by the use of restriction enzymes whose recognition 
and cleavage sites are present in said polynucleotide as 
defined above, or by amplification with the aid of oligo- 
nucleotide primers specific for said polynucleotide as 
defined above, or by transcription in vitro, or by chemical 
synthesis. 

[0039] According to an advantageous embodiment of said 
fragment, it is selected from the group consisting of: the 
cDNA corresponding to at least one open reading frame 
(ORF) chosen from: ORF1a, ORF1b, ORF-S, ORF-E, ORF-M, 
ORF-N, ORF3, ORF4, ORF7 to ORF11, ORF13 and 
ORF14 and the cDNA corresponding to the noncoding 5' or 
3' ends of said polynucleotide. 

[0040] According to an advantageous feature of this 
embodiment, said fragment has a sequence selected from the 
group consisting of: 

[0041] the sequences SEQ ID NO: 2 and 4 representing 
the cDNA corresponding to the ORF-S which encodes 
the S protein, 

[0042] the sequences SEQ ID NO: 13 and 15 representing 
the cDNA corresponding to the ORF-E which 
encodes the E protein, 

[0043] the sequences SEQ ID NO: 16 and 18 representing 
the cDNA corresponding to the ORF-M which 
encodes the M protein, 

[0044] the sequences SEQ ID NO: 36 and 38 representing 
the cDNA corresponding to the ORF-N which 
encodes the N protein, 

[0045] the sequences representing the cDNA corresponding 
respectively: to ORF1a and ORF1b (ORF1ab, 
SEQ ID NO: 31), to ORF3 and ORF4 (SEQ ID NO: 7, 
8), to ORF7 to 11 (SEQ ID NO: 19, 20), to ORF13 (SEQ 
ID NO: 32) and to ORF14 (SEQ ID NO: 34), and 

[0046] the sequences representing the cDNAs corre- 
sponding respectively to the noncoding 5' (SEQ ID NO: 
39 and 72) and 3' (SEQ ID NO: 40, 73) ends of said 
polynucleotide. 

[0047] The subject of the present invention is also a cDNA 
fragment encoding the S protein, as defined above, charac- 
terized in that it has a sequence selected from the group 
consisting of the sequences SEQ ID NO: 5 and 6 (Sa and Sb 
fragments). 

[0048] The subject of the present invention is also a cDNA 
fragment corresponding to ORF1a and ORF1b as defined 
above, characterized in that it has a sequence selected from 
the group consisting of the sequences SEQ ID NO: 41 to 54 
(L0 to L12 fragments). 

[0049] The subject of the present invention is also a 
polynucleotide fragment as defined above, characterized in 
that it has at least 15 consecutive bases or base pairs of the 
sequence of the genome of said strain including at least one 
of those situated in position 7919, 16622, 19064, 23220, 
24872, 25298 and 26857. Preferably this is a fragment of 20 
to 2500 bases or base pairs, preferably from 20 to 400. 

[0050] According to an advantageous embodiment of said 
fragment, it includes at least one pair of bases or base pairs 
corresponding to the following positions: 7919 and 23220, 
7919 and 25298, 16622 and 23220, 19064 and 23220, 16622 
and 25298, 19064 and 25298, 23220 and 24872, 23220 and 
26857, 24872 and 25298, 25298 and 26857. 
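
The selection of a fragment that spans one or more of the listed positions can be sketched as a simple windowing step. The helper below is hypothetical and only illustrative: it takes 1-based positions and returns the shortest window of at least `min_len` bases containing all of them; the toy sequence in the example stands in for the genome, which is not reproduced here.

```python
def fragment_covering(genome: str, positions, min_len: int = 15):
    """Return (start, end, subsequence), 1-based inclusive, for the shortest
    window of at least min_len bases that contains every listed position."""
    lo, hi = min(positions), max(positions)
    if lo < 1 or hi > len(genome):
        raise ValueError("position outside the sequence")
    if len(genome) < min_len:
        raise ValueError("sequence shorter than the minimum fragment length")
    span = hi - lo + 1
    length = max(span, min_len)
    # Pad roughly evenly on both sides when the span is shorter than min_len.
    pad = max(0, min_len - span)
    start = max(1, lo - pad // 2)
    end = start + length - 1
    if end > len(genome):
        # Shift the window left so that it stays inside the sequence.
        end = len(genome)
        start = end - length + 1
    return start, end, genome[start - 1:end]


if __name__ == "__main__":
    toy_genome = "acgt" * 10  # 40-base placeholder, not viral sequence
    print(fragment_covering(toy_genome, [20]))
    print(fragment_covering(toy_genome, [5, 30]))
```

For a pair such as 23220 and 24872, the same logic would give a 1653-base window, which falls within the 20 to 2500 base range stated above.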

[0051] The subject of the present invention is also primers 
of at least 18 bases capable of amplifying a fragment of the 
genome of a SARS-associated coronavirus or of the DNA 
equivalent thereof. 

[0052] According to an embodiment of said primers, they 
are selected from the group consisting of: 

[0053] the pair of primers No. 1 corresponding respec- 
tively to positions 28507 to 28522 (sense primer, SEQ 
ID NO: 60) and 28774 to 28759 (antisense primer, SEQ 
ID NO: 61) of the sequence of the polynucleotide as 
defined above, 

[0054] the pair of primers No. 2 corresponding respec- 
tively to positions 28375 to 28390 (sense primer, SEQ 
ID NO: 62) and 28702 to 28687 (antisense primer, SEQ 
ID NO: 63) of the sequence of the polynucleotide as 
defined above, and 

[0055] the pair of primers consisting of the primers SEQ 
ID Nos: 55 and 56. 
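
The coordinate convention used for the primer pairs (ascending positions for the sense primer, descending positions for the antisense primer) can be illustrated as follows. This is a sketch under stated assumptions: `primer_from_coords` is a hypothetical helper, positions are treated as 1-based and inclusive, and the toy sequence stands in for the genome.

```python
# Translation table for complementing DNA bases (lower and upper case).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_from_coords(genome: str, first: int, last: int) -> str:
    """Read a primer from 1-based inclusive genome coordinates.

    first <= last: sense primer, read directly from the genome.
    first >  last: antisense primer, i.e. the reverse complement of the
    genome segment running from `last` to `first`.
    """
    if first <= last:
        return genome[first - 1:last]
    return reverse_complement(genome[last - 1:first])


if __name__ == "__main__":
    toy_genome = "aaaccgggtt"  # placeholder sequence, not viral
    print(primer_from_coords(toy_genome, 1, 4))   # sense
    print(primer_from_coords(toy_genome, 10, 7))  # antisense
```
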

[0056] The subject of the present invention is also a probe 
capable of detecting the presence of the genome of a 
SARS-associated coronavirus or of a fragment thereof, 
characterized in that it is selected from the group consisting 
of: the fragments as defined above and the fragments cor- 
responding to the following positions of the polynucleotide 
sequence as defined above: 28561 to 28586, 28588 to 28608, 
28541 to 28563 and 28565 to 28589 (SEQ ID NO: 64 to 67). 

[0057] The probes and primers according to the invention 
may be labeled directly or indirectly with a radioactive or 
nonradioactive compound by methods well known to per- 
sons skilled in the art so as to obtain a detectable and/or 
quantifiable signal. Among the radioactive isotopes used, 
there may be mentioned 32P, 33P, 35S, 3H or 125I. The 
nonradioactive entities are selected from ligands such as 
biotin, avidin, streptavidin, digoxigenin, haptens, dyes, 
luminescent agents such as radioluminescent, chemiluminescent, 
bioluminescent, fluorescent and phosphorescent agents. 

[0058] The invention encompasses the labeled probes and 
primers derived from the preceding sequences. 

[0059] Such probes and primers are useful for the diag- 
nosis of infection by a SARS-associated coronavirus. 






[0060] The subject of the present invention is also a 
method for the detection of a SARS-associated coronavirus, 
from a biological sample, which method is characterized in 
that it comprises at least: 

[0061] (a) the extraction of nucleic acids present in said 
biological sample, 

[0062] (b) the amplification of a fragment of ORF-N by 
RT-PCR with the aid of a pair of primers as defined above, 
and 

[0063] (c) the detection, by any appropriate means, of the 
amplification products obtained in (b). 

[0064] The amplification products (amplicons) in (b) are 
268 bp for the pair of primers No. 1 and 328 bp for the pair 
of primers No. 2. 
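
The amplicon sizes quoted above follow directly from the primer coordinates given for pairs No. 1 and No. 2. A minimal sketch of the arithmetic, with the dictionary layout chosen here purely for illustration:

```python
# Primer coordinates as listed for pairs No. 1 and No. 2
# (sense: 5' to 3' ascending; antisense: descending).
PRIMER_PAIRS = {
    "No. 1": {"sense": (28507, 28522), "antisense": (28774, 28759)},
    "No. 2": {"sense": (28375, 28390), "antisense": (28702, 28687)},
}


def amplicon_length(pair: dict) -> int:
    """Length in bp from the 5' end of the sense primer to the
    5' end of the antisense primer, both inclusive."""
    sense_5prime = pair["sense"][0]
    antisense_5prime = pair["antisense"][0]
    return antisense_5prime - sense_5prime + 1


if __name__ == "__main__":
    for name, pair in PRIMER_PAIRS.items():
        print(name, amplicon_length(pair), "bp")
```
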

[0065] According to an advantageous embodiment of said 
method, the step (c) of detection is carried out with the aid 
of at least one probe corresponding to positions 28561 to 
28586, 28588 to 28608, 28541 to 28563 and 28565 to 28589 
of the sequence of the polynucleotide as defined above. 

[0066] Preferably, the SARS-associated coronavirus 
genome is detected and optionally quantified by PCR in real 
time with the aid of the pair of primers No. 2 and probes 
corresponding to positions 28541 to 28563 and 28565 to 
28589 labeled with different compounds, in particular dif- 
ferent fluorescent agents. 

[0067] The real time RT-PCR which uses this pair of 
primers and this probe is very sensitive since it makes it 
possible to detect 10² copies of RNA and up to 10 copies of 
RNA; it is in addition reliable and reproducible. 

[0068] The invention encompasses the single-stranded, 
double-stranded and triple-stranded polydeoxyribonucle- 
otides and polyribonucleotides corresponding to the 
sequence of the genome of the isolated strain of coronavirus 
and its fragments as defined above, and to their sense or 
antisense complementary sequences, in particular the RNAs 
and cDNAs corresponding to the sequence of the genome 
and of its fragments as defined above. 

[0069] The present invention also encompasses the ampli- 
fication fragments obtained with the aid of primers specific 
for the genome of the purified or isolated strain as defined 
above, in particular with the aid of primers or pairs of 
primers as defined above, the restriction fragments formed 
by or comprising the sequence of fragments as defined 
above, the fragments obtained by transcription in vitro from 
a vector containing the sequence SEQ ID NO: 1 or a 
fragment as defined above, and fragments obtained by 
chemical synthesis. Examples of restriction fragments are 
deduced from the restriction map of the sequence SEQ ID 
NO: 1 illustrated by FIG. 13. In accordance with the 
invention, said fragments are either in the form of isolated 
fragments, or in the form of mixtures of fragments. The 
invention also encompasses fragments modified, in relation 
to the preceding ones, by removal or addition of nucleotides 
in a proportion of about 15%, relative to the length of the 
above fragments and/or modified in terms of the nature of 
the nucleotides, as long as the modified nucleotide frag- 
ments retain a capacity for hybridization with the genomic 
or antigenomic RNA sequences of the isolate as defined above. 



[0070] The nucleic acid molecules according to the inven- 
tion are obtained by conventional methods, known per se, 
following standard protocols such as those described in 
Current Protocols in Molecular Biology (Frederick M. 
AUSUBEL, 2000, Wiley and Sons Inc., Library of Congress, 
USA). For example, they may be obtained by amplification 
of a nucleic sequence by PCR or RT-PCR or alternatively by 
total or partial chemical synthesis. 

[0071] The subject of the present invention is also a DNA 
or RNA chip or filter, characterized in that it comprises at 
least one polynucleotide or one of its fragments as defined 
above. 

[0072] The DNA or RNA chips or filters according to the 
invention are prepared by conventional methods, known per 
se, such as for example chemical or electrochemical grafting 
of oligonucleotides on a glass or nylon support. 
[0073] The subject of the present invention is also a 
recombinant cloning and/or expression vector, in particular 
a plasmid, a virus, a viral vector or a phage comprising a 
nucleic acid fragment as defined above. Preferably, said 
recombinant vector is an expression vector in which said 
nucleic acid fragment is placed under the control of appro- 
priate elements for regulating transcription and translation. 
In addition, said vector may comprise sequences (tags) fused 
in phase with the 5' and/or 3' end of said insert, which are 
useful for the immobilization and/or detection and/or puri- 
fication of the protein expressed from said vector. 
[0074] These vectors are constructed and introduced into 
host cells by conventional recombinant DNA and genetic 
engineering methods which are known per se. Numerous 
vectors into which a nucleic acid molecule of interest may 
be inserted in order to introduce it and to maintain it in a host 
cell are known per se; the choice of an appropriate vector 
depends on the use envisaged for this vector (for example 
replication of the sequence of interest, expression of this 
sequence, maintenance of the sequence in extrachromosomal 
form or alternatively integration into the chromosomal 
material of the host), and on the nature of the host cell. 
[0075] In accordance with the invention, said plasmid is 
selected in particular from the following plasmids: 

[0076] the plasmid, called SARS-S, contained in the 
bacterial strain deposited under the No. I-3059, on Jun. 
20, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains the cDNA sequence encod- 
ing the S protein of the SARS-CoV strain derived from 
the sample recorded under the No. 031589, said 
sequence corresponding to the nucleotides at positions 
21406 to 25348 (SEQ ID NO: 4), with reference to the 
Genbank sequence AY274119.3, 

[0077] the plasmid, called SARS-S1, contained in the 
bacterial strain deposited under the No. I-3020, on May 
12, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains a 5' fragment of the cDNA 
sequence encoding the S protein of the SARS-CoV 
strain derived from the sample recorded under the No. 
031589, as defined above, said fragment corresponding 
to the nucleotides at positions 21406 to 23454 (SEQ ID 
NO: 5), with reference to the Genbank sequence 
AY274119.3 Tor2, 

[0078] the plasmid, called SARS-S2, contained in the 
bacterial strain deposited under the No. I-3019, on May 
12, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains a 3' fragment of the cDNA 
sequence encoding the S protein of the SARS-CoV 
strain derived from the sample recorded under the 
number No. 031589, as defined above, said fragment 
corresponding to the nucleotides at positions 23322 to 
25348 (SEQ ID NO: 6), with reference to the Genbank 
sequence accession No. AY274119.3, 

[0079] the plasmid, called SARS-SE, contained in the 
bacterial strain deposited under the No. I-3126, on Nov. 
13, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains the cDNA corresponding to 
the region situated between ORF-S and ORF-E and 
overlapping ORF-E of the SARS-CoV strain derived 
from the sample recorded under the No. 031589, as 
defined above, said region corresponding to the nucle- 
otides at positions 25110 to 26244 (SEQ ID NO: 8), 
with reference to the Genbank sequence accession No. 
AY274119.3, 

[0080] the plasmid, called SARS-E, contained in the 
bacterial strain deposited under the No. I-3046, on May 
28, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains the cDNA sequence encod- 
ing the E protein of the SARS-CoV strain derived from 
the sample recorded under the No. 031589, as defined 
above, said sequence corresponding to the nucleotides 
at positions 26082 to 26413 (SEQ ID NO: 15), with 
reference to the Genbank sequence accession No. 
AY274119.3, 

[0081] the plasmid, called SARS-M, contained in the 
bacterial strain deposited under the No. I-3047, on May 
28, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains the cDNA sequence encod- 
ing the M protein of the SARS-CoV strain derived from 
the sample recorded under the No. 031589, as defined 
above; said sequence corresponding to the nucleotides 
at positions 26330 to 27098 (SEQ ID NO: 18), with 
reference to the Genbank sequence accession No. 
AY274119.3, 

[0082] the plasmid, called SARS-MN, contained in the 
bacterial strain deposited under the No. I-3125, on 
Nov. 13, 2003, at the Collection Nationale de Cultures 
de Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains the cDNA sequence corre- 
sponding to the region situated between ORF-M and 
ORF-N of the SARS-CoV strain derived from the 
sample recorded under the No. 031589 and collected in 
Hanoi, as defined above, said sequence corresponding 
to the nucleotides at positions 26977 to 28218 (SEQ ID 
NO: 20), with reference to the Genbank accession No. 
AY274119.3, 

[0083] the plasmid, called SARS-N, contained in the 
bacterial strain deposited under the No. I-3048, on Jun. 
5, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains the cDNA encoding the N 
protein of the SARS-CoV strain derived from the 
sample recorded under the No. 031589, as defined 
above, said sequence corresponding to the nucleotides 
at positions 28054 to 29430 (SEQ ID NO: 38), with 
reference to the Genbank sequence accession No. 
AY274119.3; thus, this plasmid comprises an insert of 
sequence SEQ ID NO: 38 and is contained in a bacterial 
strain which was deposited under the No. I-3048, on 
Jun. 5, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15, 

[0084] the plasmid, called SARS-5'NC, contained in the 
bacterial strain deposited under the No. I-3124, on Nov. 
7, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains the cDNA corresponding to 
the noncoding 5' end of the genome of the SARS-CoV 
strain derived from the sample recorded under the No. 
031589, as defined above, said sequence corresponding 
to the nucleotides at positions 1 to 204 (SEQ ID NO: 
39), with reference to the Genbank sequence accession 
No. AY274119.3, 

[0085] the plasmid called SARS-3'NC, contained in the 
bacterial strain deposited under the No. I-3123, on Nov. 
7, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 
Paris Cedex 15; it contains the cDNA sequence corre- 
sponding to the noncoding 3' end of the genome of the 
SARS-CoV strain derived from the sample recorded 
under the No. 031589, as defined above; said sequence, 
corresponding to the nucleotides at positions 
28933 to 29727 (SEQ ID NO: 40), with 
reference to the Genbank sequence accession No. 
AY274119.3, ends with a series of a nucleotides, 

[0086] the expression plasmid, called pIV2.3N, containing 
a cDNA fragment encoding a C-terminal fusion 
of the N protein (SEQ ID NO: 37) with a polyhistidine 
tag, 

[0087] the expression plasmid, called pIV2.3SC, containing 
a cDNA fragment encoding a C-terminal fusion 
of the fragment corresponding to positions 475 to 1193 
of the amino acid sequence of the S protein (SEQ ID 
NO: 3) with a polyhistidine tag, 

[0088] the expression plasmid, pIV2.3SL, containing a 
cDNA fragment encoding a C-terminal fusion of the 
fragment corresponding to positions 14 to 1193 of the 
amino acid sequence of the S protein (SEQ ID NO: 3) 
with a polyhistidine tag, 

[0089] the expression plasmid, called pIV2.4N, containing 
a cDNA fragment encoding an N-terminal fusion 
of the N protein (SEQ ID NO: 3) with a polyhistidine 
tag, 

[0090] the expression plasmid, called pIV2.4SC or 
pIV2.4S1, containing an insert encoding an N-terminal 
fusion of the fragment corresponding to positions 475 
to 1193 of the amino acid sequence of the S protein 
(SEQ ID NO: 3) with a polyhistidine tag, and 

[0091] the expression plasmid, called pIV2.4SL, containing 
a cDNA fragment encoding an N-terminal 
fusion of the fragment corresponding to positions 14 to 
1193 of the amino acid sequence of the S protein (SEQ 
ID NO: 3) with a polyhistidine tag. 
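
The insert coordinates listed for the deposited plasmids can be gathered into a small lookup to derive insert lengths and overlaps. This is an illustrative sketch only: the coordinates are those stated in the list above (relative to Genbank AY274119.3), while the dictionary and helper names are hypothetical.

```python
# Insert coordinates (1-based, inclusive) as stated for each plasmid,
# with reference to Genbank AY274119.3.
PLASMID_INSERTS = {
    "SARS-S": (21406, 25348),
    "SARS-S1": (21406, 23454),
    "SARS-S2": (23322, 25348),
    "SARS-SE": (25110, 26244),
    "SARS-E": (26082, 26413),
    "SARS-M": (26330, 27098),
    "SARS-MN": (26977, 28218),
    "SARS-N": (28054, 29430),
    "SARS-5'NC": (1, 204),
    "SARS-3'NC": (28933, 29727),
}


def insert_length(name: str) -> int:
    """Number of nucleotides covered by the named insert."""
    start, end = PLASMID_INSERTS[name]
    return end - start + 1


def overlap(name_a: str, name_b: str) -> int:
    """Number of nucleotides shared by two inserts (0 if disjoint)."""
    start = max(PLASMID_INSERTS[name_a][0], PLASMID_INSERTS[name_b][0])
    end = min(PLASMID_INSERTS[name_a][1], PLASMID_INSERTS[name_b][1])
    return max(0, end - start + 1)


if __name__ == "__main__":
    for name in PLASMID_INSERTS:
        print(f"{name}: {insert_length(name)} nt")
    print("S1/S2 overlap:", overlap("SARS-S1", "SARS-S2"), "nt")
```

With these coordinates, the SARS-S1 and SARS-S2 inserts overlap by 133 nucleotides and together span the same region as the SARS-S insert.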

[0092] According to an advantageous feature of the 
expression plasmid as defined above, it is contained in a 
bacterial strain which was deposited under the No. I-3117, 
on Oct. 23, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 Paris 
Cedex 15. 

[0093] According to another advantageous feature of the 
expression plasmid as defined above, it is contained in a 
bacterial strain which was deposited under the No. I-3118, 
on Oct. 23, 2003, at the Collection Nationale de Cultures de 
Microorganismes, 25 rue du Docteur Roux, 75724 Paris 
Cedex 15. 

[0094] According to another feature of the expression 
plasmid as defined above, it is contained in a bacterial strain 
which was deposited at the CNCM, 25 rue du Docteur Roux, 
75724 Paris Cedex 15 under the following numbers: 



[0095] a) strain No. I-3118, deposited on Oct. 23, 2003, 

[0096] b) strain No. I-3019, deposited on May 12, 2003, 

[0097] c) strain No. I-3020, deposited on May 12, 2003, 

[0098] d) strain No. I-3059, deposited on Jun. 20, 2003, 

[0099] e) strain No. I-3323, deposited on Nov. 22, 2004, 

[0100] f) strain No. I-3324, deposited on Nov. 22, 2004, 

[0101] g) strain No. I-3326, deposited on Dec. 1, 2004, 

[0102] h) strain No. I-3327, deposited on Dec. 1, 2004, 

[0103] i) strain No. I-3332, deposited on Dec. 1, 2004, 

[0104] j) strain No. I-3333, deposited on Dec. 1, 2004, 

[0105] k) strain No. I-3334, deposited on Dec. 1, 2004, 

[0106] l) strain No. I-3335, deposited on Dec. 1, 2004, 

[0107] m) strain No. I-3336, deposited on Dec. 1, 2004, 

[0108] n) strain No. I-3337, deposited on Dec. 1, 2004, 

[0109] o) strain No. I-3338, deposited on Dec. 2, 2004, 

[0110] p) strain No. I-3339, deposited on Dec. 2, 2004, 

[0111] q) strain No. I-3340, deposited on Dec. 2, 2004, 

[0112] r) strain No. I-3341, deposited on Dec. 2, 2004. 


[0113] The subject of the present invention is also a 
nucleic acid insert of viral origin, characterized in that it is 
contained in any of the strains as defined above in a)-r). 
[0114] The subject of the present invention is also a 
nucleic acid containing a synthetic gene allowing optimized 
expression of the S protein in eukaryotic cells, characterized 
in that it possesses the sequence SEQ ID NO: 140. 
[0115] The subject of the present invention is also an 
expression vector containing a nucleic acid containing a 
synthetic gene allowing optimized expression of the S 
protein, which vector is contained in the bacterial strain 
deposited at the CNCM, on Dec. 1, 2004, under the No. 
1-3333. 

[0116] According to one embodiment of said expression 
vector, it is a viral vector, in the form of a viral particle or 
in the form of a recombinant genome. 



[0117] According to an advantageous feature of this 
embodiment, this is a recombinant viral particle or a recom- 
binant viral genome capable of being obtained by transfec- 
tion of a plasmid according to paragraphs g), h) and k) to r) 
as defined above, in an appropriate cellular system, that is to 
say, for example, cells transfected with one or more other 
plasmids intended to transcomplement certain functions of 
the virus that are deleted in the vector and that are necessary 
for the formation of the viral particles. 

[0118] The expression "S protein family" is understood 
here to mean the complete S protein, its ectodomain and 
fragments of this ectodomain which are preferably produced 
in a eukaryotic system. 

[0119] The subject of the present invention is also a 
lentiviral vector encoding a polypeptide of the S protein 
family, as defined above. 

[0120] The subject of the present invention is also a 
recombinant measles virus encoding a polypeptide of the S 
protein family, as defined above. 

[0121] The subject of the present invention is also a 
recombinant vaccinia virus encoding a polypeptide of the S 
protein family, as defined above. 

[0122] The subject of the present invention is also the use 
of a vector according to paragraphs e) to r) as defined above, 
or of a vector containing a synthetic gene for the S protein, 
as defined above, for the production, in a eukaryotic system, 
of the SARS-associated coronavirus S protein or of a 
fragment of this protein. 

[0123] The subject of the present invention is also a 
method for producing the S protein in a eukaryotic system, 
comprising a step of transfecting eukaryotic cells in culture 
with a vector chosen from the vectors contained in the 
bacterial strains mentioned in paragraphs e) to r) above or a 
vector containing a synthetic gene allowing optimized 
expression of the S protein. 

[0124] The subject of the present invention is also a cDNA 
library characterized in that it comprises fragments as 
defined above, in particular amplification fragments or 
restriction fragments, cloned into a recombinant vector, in 
particular an expression vector (expression library). 

[0125] The subject of the present invention is also cells, in 
particular prokaryotic cells, modified by a recombinant 
vector as defined above. 

[0126] The subject of the present invention is also a 
genetically modified eukaryotic cell expressing a protein or 
a polypeptide as defined above. Quite obviously, the terms 
"genetically modified eukaryotic cell" do not denote a cell 
modified with a wild-type virus. 

[0127] According to an advantageous embodiment of said 
cell, it is capable of being obtained by transfection with any 
of the vectors mentioned in paragraphs k) to n) above. 

[0128] According to an advantageous feature of this 
embodiment, this is the cell FRhK4-Ssol-30, deposited at the 
CNCM on Nov. 22, 2004, under the No. 1-3325. 

[0129] The recombinant vectors as defined above and the 
cells transformed with said expression vectors are 
advantageously used for the production of the corresponding 
proteins and peptides. The expression libraries derived from 
said vectors, and the cells transformed with said expression 
libraries are advantageously used to identify the 
immunogenic epitopes (B and T epitopes) of the SARS-associated 
coronavirus proteins. 

[0130] The subject of the present invention is also the 
purified or isolated proteins and peptides, characterized in 
that they are encoded by the polynucleotide or one of its 
fragments as defined above. 

[0131] According to an advantageous embodiment of the 
invention, said protein is selected from the group consisting 
of: 

[0132] the S protein having the sequence SEQ ID NO: 
3 or its ectodomain 

[0133] the E protein having the sequence SEQ ID NO: 
14 

[0134] the M protein having the sequence SEQ ID NO: 
17 

[0135] the N protein having the sequence SEQ ID NO: 
37 

[0136] the proteins encoded by the ORFs: ORF1a, 
ORF1b, ORF3, ORF4 and ORF7 to ORF11, ORF13 
and ORF14 and having the respective sequence, SEQ 
ID NO: 74, 75, 10, 12, 22, 24, 26, 28, 30, 33 and 35. 

[0137] The terms "ectodomain of the S protein" and 
"soluble form of the S protein" will be used interchangeably. 

[0138] According to an advantageous embodiment of the 
invention, said polypeptide consists of the amino acids 
corresponding to positions 1 to 1193 of the amino acid 
sequence of the S protein. 

[0139] According to another advantageous embodiment of 
the invention, said peptide is selected from the group con- 
sisting of: 

[0140] a) the peptides corresponding to positions 14 to 
1193 and 475 to 1193 of the amino acid sequence of the S 
protein, 

[0141] b) the peptides corresponding to positions 2 to 14 
(SEQ ID NO: 69) and 100 to 221 of the amino acid sequence 
of the M protein; these peptides correspond respectively to 
the ectodomain and to the endodomain of the M protein, and 

[0142] c) the peptides corresponding to positions 1 to 12 
(SEQ ID NO: 70) and 53 to 76 (SEQ ID NO: 71) of the 
amino acid sequence of the E protein; these peptides corre- 
spond respectively to the ectodomain and to the C-terminal 
end of the E protein, and 

[0143] d) the peptides of 5 to 50 consecutive amino acids, 
preferably of 10 to 30 amino acids, inclusive or partially or 
completely overlapping the sequence of the peptides as 
defined in a), b) or c). 
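
By way of non-limiting illustration, the position ranges cited in a) to c) can be read as 1-based, inclusive coordinates on the amino acid sequence of the protein concerned. The Python sketch below shows that convention only; the function name `peptide` and the toy sequence are hypothetical and are not taken from the sequence listing.

```python
def peptide(sequence: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of an amino acid sequence."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("positions must satisfy 1 <= first <= last <= len(sequence)")
    # Python slices are 0-based and end-exclusive, hence first - 1.
    return sequence[first - 1:last]


# Artificial 20-residue sequence, for illustration only (not a SARS-CoV sequence).
toy = "ACDEFGHIKLMNPQRSTVWY"
print(peptide(toy, 2, 14))  # residues 2 to 14 of the toy sequence
```

Under this reading, a range such as "positions 2 to 14" designates 13 residues.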

[0144] The subject of the present invention is also a 
peptide, characterized in that it has a sequence of 7 to 50 
amino acids including an amino acid residue selected from 
the group consisting of: 

[0145] the alanine situated at position 2552 of the amino 
acid sequence of the protein encoded by ORF1a, 



[0146] the serine situated at position 577 of the amino 
acid sequence of the S protein of the SARS-CoV strain 
as defined above, 

[0147] the glycine at position 11 of the amino acid 
sequence of the protein encoded by ORF3 of the 
SARS-CoV strain as defined above, 

[0148] the serine at position 154 of the amino acid 
sequence of the M protein of the SARS-CoV strain as 
defined above. 
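
By way of non-limiting illustration, the peptides of 7 to 50 amino acids that include a given residue can be enumerated as sliding windows over the parent sequence. The Python sketch below is one possible way of doing so; the function name `windows_containing`, its parameters and the toy sequence are hypothetical, and positions are assumed to be 1-based and inclusive.

```python
def windows_containing(sequence: str, position: int, min_len: int = 7, max_len: int = 50):
    """Yield (first, last, peptide) for every window of min_len..max_len residues
    that includes the residue at `position` (1-based, inclusive coordinates)."""
    n = len(sequence)
    if not 1 <= position <= n:
        raise ValueError("position outside the sequence")
    for length in range(min_len, min(max_len, n) + 1):
        # The window must start no later than `position` and end no earlier than it.
        lowest_first = max(1, position - length + 1)
        highest_first = min(position, n - length + 1)
        for first in range(lowest_first, highest_first + 1):
            last = first + length - 1
            yield first, last, sequence[first - 1:last]


# Artificial 20-residue sequence, for illustration only (not a SARS-CoV sequence).
toy = "ACDEFGHIKLMNPQRSTVWY"
for first, last, pep in windows_containing(toy, 11, min_len=7, max_len=8):
    print(first, last, pep)
```

With a real sequence, `position` would be, for example, 577 for the S protein or 154 for the M protein, as cited above.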

[0149] The subject of the present invention is also an 
antibody or a polyclonal or monoclonal antibody fragment 
which can be obtained by immunization of an animal with 
a recombinant vector as defined above, a cDNA library as 
defined above or alternatively a protein or a peptide as 
defined above, characterized in that it binds to at least one 
of the proteins encoded by SARS-CoV as defined above. 

[0150] The invention encompasses the polyclonal antibod- 
ies, the monoclonal antibodies, the chimeric antibodies such 
as the humanized antibodies, and fragments thereof (Fab, Fv, 
scFv). 

[0151] A subject of the present invention is also a hybri- 
doma producing a monoclonal antibody against the N pro- 
tein, characterized in that it is chosen from the following 
hybridomas: 

[0152] the hybridoma producing the monoclonal anti- 
body 87, deposited at the CNCM on Dec. 1, 2004 under 
the number I-3328, 

[0153] the hybridoma producing the monoclonal anti- 
body 86, deposited at the CNCM on Dec. 1, 2004 under 
the number I-3329, 

[0154] the hybridoma producing the monoclonal anti- 
body 57, deposited at the CNCM on Dec. 1, 2004 under 
the number I-3330, and 

[0155] the hybridoma producing the monoclonal anti- 
body 156, deposited at the CNCM on Dec. 1, 2004 
under the number I-3331. 

[0156] The subject of the present invention is also a 
polyclonal or monoclonal antibody or antibody fragment 
directed against the N protein, characterized in that it is 
produced by a hybridoma as defined above. 
[0157] For the purposes of the present invention, the 
expression chimeric antibody is understood to mean, in 
relation to an antibody of a particular animal species or of a 
particular class of antibody, an antibody comprising all or 
part of a heavy chain and/or of a light chain of an antibody 
of another animal species or of another class of antibody. 
[0158] For the purposes of the present invention, the 
expression humanized antibody is understood to mean a 
human immunoglobulin in which the residues of the CDRs 
(Complementarity Determining Regions) which form the 
antigen-binding site are replaced by those of a nonhuman 
monoclonal antibody possessing the desired specificity, 
affinity or activity. Compared with the nonhuman antibodies, 
the humanized antibodies are less immunogenic and possess 
a prolonged half-life in humans because they possess only a 
small proportion of nonhuman sequences given that practi- 
cally all the residues of the FR (Framework) regions and of 
the constant (Fc) region of these antibodies are those of a 
consensus sequence of human immunoglobulins. 
[0159] A subject of the present invention is also a protein 
chip or filter, characterized in that it comprises a protein, a 
peptide or alternatively an antibody as defined above. 

[0160] The protein chips according to the invention are 
prepared by conventional methods known per se. Among the 
appropriate supports on which proteins may be immobilized, 
there may be mentioned those made of plastic or glass, in 
particular in the form of microplates. 

[0161] The subject of the present invention is also reagents 
derived from the isolated strain of SARS-associated coro- 
navirus, derived from the sample recorded under the No. 
031589, which are useful for the study and diagnosis of the 
infection caused by a SARS-associated coronavirus, said 
reagents are selected from the group consisting of: 

[0162] (a) a pair of primers, a probe or a DNA chip as 
defined above, 

[0163] (b) a recombinant vector or a modified cell as 
defined above, 

[0164] (c) an isolated coronavirus strain or a polynucle- 
otide as defined above, 

[0165] (d) a protein or a peptide as defined above, 

[0166] (e) an antibody or an antibody fragment as 
defined above, and 

[0167] (f) a protein chip as defined above. 

[0168] These various reagents are prepared and used 
according to conventional molecular biology and immunol- 
ogy techniques following standard protocols such as those 
described in Current Protocols in Molecular Biology (Fre- 
derick M. AUSUBEL, 2000, Wiley and Son Inc., Library of 
Congress, USA), in Current Protocols in Immunology (John 
E. Coligan, 2000, Wiley and Son Inc., Library of Congress, 
USA) and in Antibodies: A Laboratory Manual (E. Harlow 
and D. Lane, Cold Spring Harbor Laboratory, 1988). 

[0169] The nucleic acid fragments according to the inven- 
tion are prepared and used according to conventional tech- 
niques as defined above. The peptides and proteins accord- 
ing to the invention are prepared by recombinant DNA 
techniques, known to persons skilled in the art, in particular 
with the aid of the recombinant vectors as defined above. 
Alternatively, the peptides according to the invention may be 
prepared by conventional techniques of solid or liquid phase 
synthesis, known to persons skilled in the art. 

[0170] The polyclonal antibodies are prepared by immu- 
nizing an appropriate animal with a protein or a peptide as 
defined above, optionally coupled to KLH or to albumin 
and/or combined with an appropriate adjuvant such as 
(complete or incomplete) Freund's adjuvant or aluminum 
hydroxide; after obtaining a satisfactory antibody titer, the 
antibodies are harvested by collecting serum from the immu- 
nized animals and enriched with IgG by precipitation, 
according to conventional techniques, and then the IgGs 
specific for the SARS-CoV proteins are optionally purified 
by affinity chromatography on an appropriate column to 
which said peptide or said protein is attached, as defined 
above, so as to obtain a monospecific IgG preparation. 



[0171] The monoclonal antibodies are produced from 
hybridomas obtained by fusion of B lymphocytes from an 
animal immunized with a protein or a peptide as defined 
above with myelomas, according to the Kohler and Milstein 
technique (Nature, 1975, 256, 495-497); the hybridomas are 
cultured in vitro, in particular in fermenters or produced in 
vivo, in the form of ascites; alternatively, said monoclonal 
antibodies are produced by genetic engineering as described 
in American patent U.S. Pat. No. 4,816,567. 

[0172] The humanized antibodies are produced by general 
methods such as those described in International application 
WO 98/45332. 

[0173] The antibody fragments are produced from the 
cloned VH and VL regions, from the mRNAs of hybridomas 
or splenic lymphocytes of an immunized mouse; for 
example, the Fv, scFv or Fab fragments are expressed at the 
surface of filamentous phages according to the Winter and 
Milstein technique (Nature, 1991, 349, 293-299); after sev- 
eral selection steps, the antibody fragments specific for the 
antigen are isolated and expressed in an appropriate expres- 
sion system, by conventional techniques for cloning and 
expression of recombinant DNA. 

[0174] The antibodies or fragments thereof as defined 
above are purified by conventional techniques known to 
persons skilled in the art, such as affinity chromatography. 

[0175] The subject of the present invention is additionally 
the use of a product selected from the group consisting of: 
a pair of primers, a probe, a DNA chip, a recombinant vector, 
a modified cell, an isolated coronavirus strain, a polynucle- 
otide, a protein or a peptide, an antibody or an antibody 
fragment and a protein chip as defined above, for the 
preparation of a reagent for the detection and optionally 
genotyping/serotyping of a SARS-associated coronavirus. 

[0176] The proteins and peptides according to the inven- 
tion, which are capable of being recognized and/or of 
inducing the production of antibodies specific for the SARS- 
associated coronavirus, are useful for the diagnosis of infec- 
tion with such a coronavirus; the infection is detected by an 
appropriate technique (in particular EIA, ELISA, RIA or 
immunofluorescence) in a biological sample collected 
from an individual capable of being infected. 

[0177] According to an advantageous feature of said use, 
said proteins are selected from the group consisting of the S, 
E, M and/or N proteins and the peptides as defined above. 

[0178] The S, E, M and/or N proteins and the peptides 
derived from these proteins as defined above, for example 
the N protein, are used for the indirect diagnosis of a 
SARS-associated coronavirus infection (serological diagno- 
sis; detection of an antibody specific for SARS-CoV), in 
particular by an immunoenzymatic method (ELISA). 

[0179] The antibodies and antibody fragments according 
to the invention, in particular those directed against the S, E, 
M and/or N proteins and the derived peptides as defined 
above, are useful for the direct diagnosis of a SARS- 
associated coronavirus infection; the detection of the pro- 
tein(s) of SARS-CoV is carried out by an appropriate 
technique, in particular EIA, ELISA, RIA, immunofluores- 
cence, in a biological sample collected from an individual 
capable of being infected. 

[0180] The subject of the present invention is also a 
method for the detection of a SARS-associated coronavirus, 
from a biological sample, which method is characterized in 
that it comprises at least: 

[0181] (a) bringing said biological sample into contact 
with at least one antibody or one antibody fragment, 
one protein, one peptide or alternatively one protein or 
peptide chip or filter as defined above, and 

[0182] (b) visualizing by any appropriate means anti- 
gen-antibody complexes formed in (a), for example by 
EIA, ELISA, RIA, or by immunofluorescence. 
[0183] According to one advantageous embodiment of 
said process, step (a) comprises: 

[0184] (a1) bringing said biological sample into contact 
with at least a first antibody or an antibody fragment 
which is attached to an appropriate support, in particu- 
lar a microplate, 

[0185] (a2) washing the solid phase, and 

[0186] (a3) adding at least a second antibody or an 
antibody fragment, different from the first, said anti- 
body or antibody fragment being optionally appropri- 
ately labeled. 

[0187] This method, which makes it possible to capture 
the viral particles present in the biological sample, is also 
called immunocapture method. 
[0188] For example: 
[0189] step (a1) is carried out with at least a first 
monoclonal or polyclonal antibody or a fragment 
thereof, directed against the S, M and/or E protein, 
and/or a peptide corresponding to the ectodomain of 
one of these proteins (M2-14 or E1-12 peptides), 
[0190] step (a3) is carried out with at least one antibody 
or an antibody fragment directed against another 
epitope of the same protein or preferably against 
another protein, preferably against an inner protein 
such as the N nucleoprotein or the endodomain of the 
E or M protein, more preferably still these are antibod- 
ies or antibody fragments directed against the N protein 
which is very abundant in the viral particle; when an 
antibody or an antibody fragment directed against an 
inner protein (N) or against the endodomain of the E or 
M proteins is used, said antibody is incubated in the 
presence of detergent, such as Tween 20 for example, 
at concentrations of the order of 0.1%. 
[0191] step (b) for visualizing the antigen-antibody 
complexes formed is carried out, either directly with 
the aid of a second antibody labeled for example with 
biotin or an appropriate enzyme such as peroxidase or 
alkaline phosphatase, or indirectly with the aid of an 
anti-immunoglobulin serum labeled as above. The 
complexes thus formed are visualized with the aid of an 
appropriate substrate. 
[0192] According to a preferred embodiment of this aspect 
of the invention, the biological sample is mixed with the 
visualizing monoclonal antibody prior to its being brought 
into contact with the capture monoclonal antibodies. Where 
appropriate, the serum-visualizing antibody mixture is incu- 
bated for at least 10 minutes at room temperature before 
being applied to the plate. 



[0193] The subject of the present invention is also an 
immunocapture test intended to detect an infection by the 
SARS-associated coronavirus by detecting the native nucle- 
oprotein (N protein), in particular characterized in that the 
antibody used for the capture of the native viral nucleopro- 
tein is a monoclonal antibody specific for the central region 
and/or for a conformational epitope. 
[0194] According to one embodiment of said test, the 
antibody used for the capture of the N protein is the 
monoclonal antibody mAb87, produced by the hybridoma 
deposited at the CNCM on Dec. 1, 2004 under the number 
I-3328. 

[0195] According to another embodiment of said immu- 
nocapture test, the antibody used for the capture of the N 
protein is the monoclonal antibody mAb86, produced by the 
hybridoma deposited at the CNCM on Dec. 1, 2004 under 
the number I-3329. 

[0196] According to another embodiment of said immu- 
nocapture test, the monoclonal antibodies mAb86 and 
mAb87 are used for the capture of the N protein. 

[0197] In the immunocapture tests according to the inven- 
tion, it is possible to use, for visualizing the N protein, the 
monoclonal antibody mAb57, produced by the hybridoma 
deposited at the CNCM on Dec. 1, 2004 under the number 
I-3330, said antibody being conjugated with a visualizing 
molecule or particle. 

[0198] In accordance with said immunocapture test, a 
combination of the antibodies mAb57 and mAb87, conju- 
gated with a visualizing molecule or particle, is used for the 
visualization of the N protein. 

[0199] A visualizing molecule may be a radioactive atom, 
a dye, a fluorescent molecule, a fluorophore, an enzyme; a 
visualizing particle may be for example: colloidal gold, a 
magnetic particle or a latex bead. 

[0200] The subject of the present invention is also a 
reagent for detecting a SARS-associated coronavirus, char- 
acterized in that it is selected from the group consisting of: 

[0201] (a) a pair of primers or a probe as defined above, 

[0202] (b) a recombinant vector as defined above or a 
modified cell as defined above, 

[0203] (c) an isolated coronavirus strain as defined 
above or a polynucleotide as defined above, 

[0204] (d) an antibody or an antibody fragment as 
defined above, 

[0205] (e) a combination of antibodies comprising the 
monoclonal antibodies mAb86 and/or mAb87, and the 
monoclonal antibody mAb57, as defined above, 

[0206] (f) a chip or a filter as defined above. 

[0207] The subject of the present invention is also a 
method for the detection of a SARS-associated coronavirus 
infection, from a biological sample, by indirect IgG ELISA 
using the N protein, which method is characterized in that 
the plates are sensitized with an N protein solution at a 
concentration of between 0.5 and 4 µg/ml, preferably 2 
µg/ml, in a 10 mM PBS buffer pH 7.2, phenol red at 0.25 
ml/l. 
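
By way of non-limiting illustration, the volume of N protein stock needed to prepare a sensitization solution in the 0.5 to 4 µg/ml range follows from the usual dilution relation C1 x V1 = C2 x V2. The Python sketch below assumes a hypothetical stock at 1000 µg/ml and a hypothetical final volume of 10 ml; neither value, nor the function name `stock_volume_ul`, comes from the present description.

```python
def stock_volume_ul(target_ug_per_ml: float, final_volume_ml: float,
                    stock_ug_per_ml: float) -> float:
    """Volume of stock (in microlitres) needed so that final_volume_ml of coating
    solution contains the protein at target_ug_per_ml (C1 * V1 = C2 * V2)."""
    if not 0 < target_ug_per_ml <= stock_ug_per_ml:
        raise ValueError("target must be positive and not exceed the stock concentration")
    # V1 (ml) = C2 * V2 / C1; multiplied by 1000 to express the result in microlitres.
    return target_ug_per_ml * final_volume_ml * 1000.0 / stock_ug_per_ml


# Hypothetical stock (1000 ug/ml) and final volume (10 ml); assumptions, not source values.
for target in (0.5, 1.0, 2.0, 4.0):
    print(target, "ug/ml:", stock_volume_ul(target, 10.0, 1000.0), "ul of stock")
```

The remainder of the final volume would be made up with the buffer described above.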
[0208] The subject of the present invention is additionally 
a method for the detection of a SARS-associated coronavi- 
rus infection, from a biological sample, by double epitope 
ELISA, characterized in that the serum to be tested is mixed 
with the visualizing antigen, said mixture then being brought 
into contact with the antigen attached to a solid support. 

[0209] According to one variant of the tests for detecting 
SARS-associated coronaviruses, these tests combine an 
ELISA using the N protein, and another ELISA using the S 
protein, as described below. 

[0210] The subject of the present invention is also an 
immune complex formed of a polyclonal or monoclonal 
antibody or antibody fragment as defined above, and of a 
SARS-associated coronavirus protein or peptide. 
[0211] The subject of the present invention is additionally 
a SARS-associated coronavirus detection kit, characterized 
in that it comprises at least one reagent selected from the 
group consisting of: a pair of primers, a probe, a DNA or 
RNA chip, a recombinant vector, a modified cell, an isolated 
coronavirus strain, a polynucleotide, a protein or a peptide, 
an antibody, and a protein chip as defined above. 

[0212] The subject of the present invention is additionally 
an immunogenic composition, characterized in that it com- 
prises at least one product selected from the group consisting 
of: 

[0213] a) a protein or a peptide as defined above, 
[0214] b) a polynucleotide of the DNA or RNA type or 
one of its representative fragments as defined above, 
having a sequence chosen from: 

[0215] (i) the sequence SEQ ID NO: 1 or its RNA 
equivalent 

[0216] (ii) the sequence hybridizing under high strin- 
gency conditions with the sequence SEQ ID NO: 1, 

[0217] (iii) the sequence complementary to the 
sequence SEQ ID NO: 1 or to the sequence hybridizing 
under high stringency conditions with the sequence 
SEQ ID NO: 1, 

[0218] (iv) the nucleotide sequence of a representative 
fragment of the polynucleotide as defined in (i), (ii) or 
(iii), 

[0219] (v) the sequence as defined in (i), (ii), (iii) or (iv), 
modified, and 

[0220] c) a recombinant expression vector comprising a 
polynucleotide as defined in b), and 
[0221] d) a cDNA library as defined above, 

said immunogenic composition being capable of inducing 
protective humoral or cellular immunity specific for the 
SARS-associated coronavirus, in particular the produc- 
tion of an antibody directed against a specific epitope of 
the SARS-associated coronavirus. 

[0222] The proteins and peptides as defined above, in 
particular the S, M, E and/or N proteins and the derived 
peptides, and the nucleic acid (DNA or RNA) molecules 
encoding said proteins or said peptides are good candidate 
vaccines and may be used in immunogenic compositions for 
the production of a vaccine against the SARS-associated 
coronavirus. 



[0223] According to an advantageous embodiment of the 
compositions according to the invention, they additionally 
contain at least one pharmaceutically acceptable vehicle and 
optionally carrier substances and/or adjuvants. 

[0224] The pharmaceutically acceptable vehicles, the car- 
rier substances and the adjuvants are those conventionally 
used. 

[0225] The adjuvants are advantageously chosen from the 
group consisting of oily emulsions, saponin, mineral sub- 
stances, bacterial extracts, aluminum hydroxide and 

[0226] The carrier substances are advantageously selected 
from the group consisting of unilamellar liposomes, multi- 
lamellar liposomes, micelles of saponin or solid micro- 
spheres of a saccharide or auriferous nature. 
[0227] The compositions according to the invention are 
administered by the general route, in particular by the 
intramuscular or subcutaneous route or alternatively by the 
local, in particular nasal (aerosol) route. 
[0228] The subject of the present invention is also the use 
of an isolated or purified protein or peptide having a 
sequence selected from the group consisting of the 
sequences SEQ ID NO: 3, 10, 12, 14, 17, 22, 24, 26, 28, 30, 
33, 35, 37, 69, 70, 71, 74 and 75 to form an immune complex 
with an antibody specifically directed against an epitope of 
the SARS-associated coronavirus. 

[0229] The subject of the present invention is also an 
immune complex consisting of an isolated or purified pro- 
tein or peptide having a sequence selected from the group 
consisting of the sequences SEQ ID NO: 3, 10, 12, 14, 17, 
22, 24, 26, 28, 30, 33, 35, 37, 69, 70, 71, 74 and 75, and of 
an antibody specifically directed against an epitope of the 
SARS-associated coronavirus. 

[0230] The subject of the present invention is also the use 
of an isolated or purified protein or peptide having a 
sequence selected from the group consisting of the 
sequences SEQ ID NO: 3, 10, 12, 14, 17, 22, 24, 26, 28, 30, 
33, 35, 37, 69, 70, 71, 74 and 75 to induce the production of 
an antibody capable of specifically recognizing an epitope of 
the SARS-associated coronavirus. 

[0231] The subject of the present invention is also the use 
of an isolated or purified polynucleotide having a sequence 
selected from the group consisting of the sequences SEQ ID 
NO: 1, 2, 4, 7, 8, 13, 15, 16, 18, 19, 20, 31, 36 and 38 to 
induce the production of an antibody directed against the 
protein encoded by said polynucleotide and capable of 
specifically recognizing an epitope of the SARS-associated 
coronavirus. 

[0232] The subject of the present invention is also mono- 
clonal antibodies recognizing the native S protein of a 
SARS-associated coronavirus. 

[0233] The subject of the present invention is also the use 
of a protein or a polypeptide of the S protein family, as 
defined above, or of an antibody recognizing the native S 
protein, as defined above, to detect an infection by a SARS- 
associated coronavirus, in a biological sample. 

[0234] The subject of the present invention is also a 
method for detecting an infection by a SARS-associated 
coronavirus, in a biological sample, characterized in that the 
detection is carried out by ELISA using the recombinant S 
protein, expressed in a eukaryotic system. 






[0235] According to an advantageous embodiment of said 
method, it is a double epitope ELISA method, and the serum 
to be tested is mixed with the visualizing antigen, said 
mixture then being brought into contact with the antigen 
attached to a solid support. 



[0236] The subject of the present invention is also an 
immune complex consisting of a monoclonal antibody or 
antibody fragment recognizing the native S protein, and of 
a protein or a peptide of the SARS-associated coronavirus. 

[0237] The subject of the present invention is also an 
immune complex consisting of a protein or a polypeptide of 
the S protein family, as defined above, and of an antibody 
specifically directed against an epitope of the SARS-asso- 
ciated coronavirus. 



[0238] The subject of the present invention is additionally 
a SARS-associated coronavirus detection kit or box, char- 
acterized in that it comprises at least one reagent selected 
from the group consisting of: a protein or polypeptide of the 
S protein family, as defined above, a nucleic acid encoding 
a protein or peptide of the S protein family, as defined above, 
a cell expressing a protein or polypeptide of the S protein 
family, as defined above, or an antibody recognizing the 
native S protein of a SARS-associated coronavirus. 

[0239] The subject of the present invention is an immu- 
nogenic and/or vaccine composition, characterized in that it 
comprises a polypeptide or a recombinant protein of the S 
protein family, as defined above, obtained in a eukaryotic 
expression system. 

[0240] The subject of the present invention is also an 
immunogenic and/or vaccine composition, characterized in 
that it comprises a vector or recombinant virus, expressing 
a protein or a polypeptide of the S protein family, as defined 
above. 

[0241] In addition to the preceding features, the invention 
further comprises other features, which will emerge from the 
description which follows, which refers to examples of use 
of the polynucleotide representing the genome of the SARS- 
CoV strain derived from the sample recorded under the 
number 031589, and derived cDNA fragments which are the 
subject of the present invention, and to Table I presenting the 
sequence listing: 

TABLE I (sequence listing; row alignment not recoverable from the source) 

Identification numbers: SEQ ID NO: 5 to 18, 20, 22 to 53 and 55 to 63. 

Sequences: genome of the [...] from the sample 031589; ORF-S*; Sa fragment; 
ORF-3 + ORF-4*; ORF3; ORF-3 protein; ORF4; ORF-4 protein; ORF-E*; ORF7; 
ORF7pr; ORF8; ORF13; ORF13 protein; ORF14; ORF14 protein. 

Positions: 21406-25348; 26082-26413; 26330-27098; 26977-28218; 211-2260; 
2136-4187; 3892-5344; 4932-6043; 28507-28522; 28774-28759; 28375-28390; 
28702-28687; 28561-28586; 28588-28608; 28541-28563; 28565-28589. 

Other entries: 1-3047; 1-3125. 


appended drawings in which: 

[0242] FIG. 1 illustrates Western-blot analysis of the 
expression in vitro of the recombinant proteins N, Sc and SL 
from the expression vectors pIVEX. Lane 1: pIV2.3N. Lane 
2: pIV2.3Sc. Lane 3: pIV2.3SL. Lane 4: pIV2.4N. Lane 5: 
pIV2.4S1 or pIV2.4Sc. Lane 6: pIV2.4SL. The expression of 
the GFP protein expressed from the same vector is used as 
a control. 

[0243] FIG. 2 illustrates the analysis, by polyacrylamide 
gel electrophoresis under denaturing conditions (SDS- 
PAGE) and staining with Coomassie blue, of the expression 
in vivo of the N protein from the expression vectors pIVEX. 
The E. coli BL21(DE3)pDIA17 strain transformed with the 
recombinant vectors pIVEX is cultured at 30° C. in LB 
medium, in the presence or in the absence of inducer (IPTG 
1 mM). Lane 1: pIV2.3N. Lane 2: pIV2.4N. 

[0244] FIG. 3 illustrates the analysis, by polyacrylamide 
gel electrophoresis under denaturing conditions (SDS- 
PAGE) and staining with Coomassie blue, of the expression 
in vivo of the SL and Sc polypeptides from the expression 
vectors pIVEX. The E. coli BL21(DE3)pDIA17 strain trans- 
formed with the recombinant vectors pIVEX is cultured at 
30° C. in LB medium, in the presence or in the absence of 
inducer (IPTG 1 mM). Lane 1: pIV2.3Sc. Lane 2: pIV2.3SL. 
Lane 3: pIV2.4S1. Lane 4: pIV2.4SL. 

[0245] FIG. 4 illustrates the antigenic activity of the 
recombinant N, SL and Sc proteins produced in the E. coli 
BL21(DE3)pDIA17 strain transformed with the recombi- 
nant vectors pIVEX. A: electrophoresis (SDS-PAGE) of the 
bacterial lysates. B and C: Western-blot with the sera, 
obtained from the same patient infected with SARS-CoV, 
collected 8 days (B: serum M12) and 29 days (C: serum 
M13) respectively after the onset of the SARS symptoms. 
Lane 1: pIV2.3N. Lane 2: pIV2.4N. Lane 3: pIV2.3Sc. Lane 
4: pIV2.4S1. Lane 5: pIV2.3SL. Lane 6: pIV2.4SL. 
[0246] FIG. 5 illustrates the purification on an Ni-NTA 
agarose column of the recombinant N protein produced in 
the E. coli BL21(DE3)pDIA17 strain from the vector 
pIV2.3N. Lane 1: total bacterial extract. Lane 2: soluble 
extract. Lane 3: insoluble extract. Lane 4: extract deposited 
on the Ni-NTA column. Lane 5: unbound proteins. Lane 6: 
fractions of peak 1. Lane 7: fractions of peak 2. 
[0247] FIG. 6 illustrates the purification of the recombi- 
nant Sc protein from the inclusion bodies produced in the E. 
coli BL21(DE3)pDIA17 strain transformed with pIV2.4S1. 
A. Treatment with Triton X-100 (2%): Lane 1 : total bacterial 
extract. Lane 2: soluble extract. Lane 3: insoluble extract. 
Lane 4: supernatant after treatment with Triton X-100 (2%). 
Lanes 5 and 6: pellet after treatment with Triton X-100 (2%). 
B: Treatment with 4 M, 5 M, 6 M and 7 M urea of the soluble 
and insoluble extracts. 

[0248] FIG. 7 represents the immunoblot produced with 
the aid of a lysate of cells infected with SARS-CoV and a 
serum from a patient suffering from atypical pneumopathy. 
[0249] FIG. 8 represents immunoblots produced with the 
aid of a lysate of cells infected with SARS-CoV and rabbit 
immunosera specific for the nucleoprotein N (A) and for the 
spicule protein S (B). I.S.: immune serum, p.i.: preimmune 
serum. The anti-N immune serum was used at 1/50 000 and 
the anti-S immune serum at 1/10 000. 
[0250] FIG. 9 illustrates the ELISA reactivity of the rabbit 
monospecific polyclonal sera directed against the N protein 
or the short fragment of the S protein (Sc), toward the 
corresponding recombinant proteins used for immunization. 
A: rabbits P13097, P13081 and P13031 immunized with the 
purified recombinant N protein. B: rabbits P11135, P13042 
and P14001 immunized with a preparation of inclusion 
bodies corresponding to the short fragment of the S protein 
(Sc). I.S.: immune serum, p.i.: preimmune serum. 
[0251] FIG. 10 illustrates the ELISA reactivity of the 
purified recombinant N protein, toward sera from patients 
suffering from atypical pneumonia caused by SARS-CoV. 
FIG. 10a: ELISA plates prepared with the N protein at the 
concentration of 4 µg/ml and 2 µg/ml. FIG. 10b: ELISA 
plate prepared with the N protein at the concentration of 1 
µg/ml. The sera designated A, B, D, E, F, G, H correspond 
to those of Table IV. 

[0252] FIG. 11 illustrates the amplification by RT-PCR of 
decreasing quantities of synthetic RNA of the SARS-CoV N 
gene (10^7 to 1 copy), with the aid of pairs of primers No. 1 
(N/+/28507, N/-/28774) (A) and No. 2 (N/+/28375, N/-/ 
28702) (B). T: amplification performed in the absence of 
RNA. MW: DNA marker. 

[0253] FIG. 12 illustrates the amplification by RT-PCR in 
real time of synthetic RNA for the SARS-CoV N gene: 
decreasing quantities of synthetic RNA as replica (repli.; 
lanes 16 to 29) and of viral RNA diluted VioxVr 4 (lane 32) 
were amplified by RT-PCR in real time with the aid of the 
kit "Light Cycler RNA Amplification Kit Hybridization 
Probes" and pairs of primers and probes of the No. 2 series, 
under the conditions described in Example 8. 
[0254] FIG. 13 (FIGS. 13.1 to 13.7) represents the restric- 
tion map of the sequence SEQ ID NO: 1 corresponding to 
the DNA equivalent of the genome of the SARS-CoV strain
derived from the sample recorded under the number 031589.

[0255] FIG. 14 shows the result of the SARS serology test 
by indirect N ELISA (1st series of sera tested).

[0256] FIG. 15 shows the result of the SARS serology test 
by indirect N ELISA (2nd series of sera tested).

[0257] FIG. 16 presents the result of the SARS serology 
test by double epitope N ELISA (1st series of sera tested).

[0258] FIG. 17 shows the result of the SARS serology test 
by double epitope N ELISA (2nd series of sera tested).

[0259] FIG. 18 illustrates the test of reactivity of the 
anti-N monoclonal antibodies by ELISA on the native 
nucleoprotein N of SARS-CoV. The antibodies were tested 
in the form of hybridoma culture supernatants by indirect 
ELISA using an irradiated lysate of VeroE6 cells infected 
with SARS-CoV as antigen (SARS lysate curves). A nega- 
tive control for reactivity is performed for each antibody on 
a lysate of uninfected VeroE6 cells (negative lysate curves). 
Several monoclonal antibodies of known specificity were 
used as negative control antibodies: para 1-3 directed against 
the antigens of the parainfluenza viruses type 1-3 (Bio-Rad) 
and influenza B directed against the antigens of the influenza 
virus type B (Bio-Rad). 

[0260] FIG. 19 illustrates the test of reactivity of the 
anti-N of SARS-CoV monoclonal antibodies by ELISA on 
the native antigens of the human coronavirus 229E (HCoV- 
229E). The antibodies were tested in the form of hybridoma 
culture supernatants by an indirect ELISA test using a lysate 
of MRC-5 cells infected with the human coronavirus 229E 
as antigen (229E lysate curves). A negative control for 
immunoreactivity was performed for each antibody on a 
lysate of noninfected MRC-5 cells (negative lysate curves). 
The monoclonal antibody 5-11H.6 directed against the S
protein of the human coronavirus 229E (Sizun et al., 1998, J.
Virol. Met. 72: 145-152) is used as positive control antibody.
The antibodies para 1-3 directed against the antigens of the 
parainfluenza virus type 1-3 (Bio-Rad) and influenza B 
directed against the antigens of the influenza virus type B 
(Bio-Rad) were added to the panel of monoclonal antibodies 
tested. 

[0261] FIG. 20 shows a test of reactivity of the anti-N of 
SARS-CoV monoclonal antibodies by Western blotting on 
the denatured native nucleoprotein N of SARS-CoV. A 
lysate of VeroE6 cells infected with SARS-CoV was prepared
in the loading buffer according to Laemmli and caused
to migrate in a 12% SDS polyacrylamide gel and then the
proteins were transferred onto PVDF membrane. The anti-N
monoclonal antibodies tested were used for the immunoassay
at the concentration of 0.05 µg/ml. The visualization is
carried out with anti-mouse IgG(H+L) antibodies coupled to 
peroxidase (NA931V, Amersham) and the ECL+ system. 
Two monoclonal antibodies were used as negative controls 
for reactivity: influenza B directed against the antigens of 
the influenza virus type B (Bio-Rad) and para 1-3 directed 
against the antigens of the parainfluenza virus type 1-3 
(Bio-Rad). 

[0262] FIG. 21 presents the plasmids for expression in 
mammalian cells of the SARS-CoV S protein. The cDNA 
for the SARS-CoV S was inserted between the BamHI and
XhoI sites of the expression plasmid pcDNA3.1(+) (Clontech)
in order to obtain the plasmid pcDNA-S and between
the NheI and XhoI sites of the expression plasmid pCI
(Promega) in order to obtain the plasmid pCI-S. The WPRE
and CTE sequences were inserted into each of the two
plasmids pcDNA-S and pCI-S between the XhoI and XbaI
sites in order to obtain the plasmids pcDNA-S-CTE,
pcDNA-S-WPRE, pCI-S-CTE and pCI-S-WPRE, respectively.

[0263] SP: signal peptide predicted (aa 1-13) with the 
software signalP v2.0 (Nielsen et al., 1997, Protein 
Engineering, 10:1-6) 

[0264] TM: transmembrane region predicted (aa 1196- 
1218) with the software TMHMM v2.0 (Sonnhammer 
et al., 1998, Proc. of Sixth Int. Conf. on Intelligent 
Systems for Molecular Biology, pp. 175-182, AAAI 
Press). It should be noted that the amino acids W1194 
and P1195 are possibly part of the transmembrane
region with the respective probabilities of 0.13 and 0.42 

[0265] P-CMV: cytomegalovirus immediate/early pro- 
moter. BGH pA: polyadenylation signal of the bovine 
growth hormone gene 

[0266] SV40 late pA: SV40 virus late polyadenylation 
signal 

[0267] SD/SA: splice donor and acceptor sites 

[0268] WPRE: sequences of the "Woodchuck Hepatitis
Virus posttranscriptional regulatory element" of the
woodchuck hepatitis virus

[0269] CTE: sequences of the "constitutive transport 
element" of the Mason-Pfizer simian retrovirus 

[0270] FIG. 22 illustrates the expression of the S protein 
after transfection of VeroE6 cells. Cellular extracts were 
prepared 48 hours after transfection of VeroE6 cells with the 
plasmids pcDNA, pcDNA-S, pCI and pCI-S. Cellular 
extracts were also prepared 18 hours after infection with the
recombinant vaccinia virus VV-TF7.3 and transfection with
the plasmids pcDNA or pcDNA-S. As a control, extracts of 
VeroE6 cells were prepared 8 hours after infection with 
SARS-CoV at a multiplicity of infection of 3. They were 
separated on an 8% SDS acrylamide gel and analyzed by 
Western blotting with the aid of an anti-S rabbit polyclonal 
antibody and an anti-rabbit IgG(H+L) polyclonal antibody 
coupled to peroxidase (NA934V, Amersham). A molecular 
mass ladder (kDa) is presented in the figure. 

[0271] SARS-CoV: extract of VeroE6 cells infected 
with SARS-CoV 

[0272] Mock: control extract of noninfected cells 

[0273] FIG. 23 illustrates the effect of the CTE and WPRE 
sequences on the expression of the S protein after transfec- 
tion of VeroE6 and 293T cells. Cellular extracts were 
prepared 48 hours after transfection of VeroE6 cells (A) or 
293T cells (B) with the plasmids pcDNA, pcDNA-S, 
pcDNA-S-CTE, pcDNA-S-WPRE, pCI-S, pCI-S-CTE and 
pCI-S-WPRE, separated on 8% SDS polyacrylamide gel and
analyzed by Western blotting with the aid of an anti-S rabbit 
polyclonal antibody and an anti-rabbit IgG(H+L) polyclonal 
antibody coupled to peroxidase (NA934V, Amersham). A 
molecular mass ladder (kDa) is presented in the figure. 
[0274] SARS-CoV: extract of VeroE6 cells prepared 8
hours after infection with SARS-CoV at a multiplicity 
of infection of 3. 

[0275] Mock: control extract of noninfected VeroE6 
cells 

[0276] FIG. 24 presents defective lentiviral vectors with 
central DNA flap for the expression of SARS-CoV S. The 
cDNA for the SARS-CoV S protein was cloned in the form 
of a BamHI-XhoI fragment into the plasmid pTRIPΔU3-CMV
containing a defective lentiviral vector TRIP with
central DNA flap (Sirven et al., 2001, Mol. Ther., 3: 438-448)
in order to obtain the plasmid pTRIP-S. The optimum
expression cassettes consisting of the CMV virus immediate/early
promoter, a splice signal, cDNA for S and either of
the posttranscriptional signals CTE or WPRE were substituted
for the cassette EF1α-EGFP of the defective lentiviral
expression vector with central DNA flap TRIPΔU3-EF1α
(Sirven et al., 2001, Mol. Ther., 3: 438-448) in order to
obtain the plasmids pTRIP-SD/SA-S-CTE and pTRIP-SD/SA-S-WPRE.


[0277] SP: signal peptide

[0278] TM: transmembrane region

[0279] P-CMV: cytomegalovirus immediate/early promoter

[0280] P-EF1α: EF1α gene promoter

[0281] SD/SA: splice donor and acceptor sites

[0282] WPRE: sequences of the "Woodchuck Hepatitis
Virus posttranscriptional regulatory element" of the
woodchuck hepatitis virus

[0283] CTE: sequences of the "constitutive transport
element" of the Mason-Pfizer simian retrovirus

[0284] LTR: long terminal repeat

[0285] ΔU3: LTR deleted for the "promoter/enhancer"
sequences

[0286] cPPT: "polypurine tract cis-active sequence"

[0287] CTS: "central termination sequence"

[0288] FIG. 25 shows the Western-blot analysis of the 
expression of the SARS-CoV S by cell lines transduced with 
the lentiviral vectors TRIP-SD/SA-S-WPRE and TRIP-SD/ 
SA-S-CTE. Cellular extracts were prepared from established 
lines FrhK4-S-CTE and FrhK4-S-WPRE after transduction 
with the lentiviral vectors TRIP-SD/SA-S-CTE and TRIP- 
SD/SA-S-WPRE respectively. They were separated on an 
8% SDS acrylamide gel and analyzed by Western blotting 
with the aid of an anti-S rabbit polyclonal antibody and an 
anti-rabbit IgG(H+L) conjugate coupled to peroxidase. A 
molecular mass ladder (kDa) is presented in the figure. 
[0289] T-: control extract of FrhK-4 cells 

[0290] T+: extract of FrhK-4 cells prepared 24 hours 
after infection with SARS-CoV at a multiplicity of 
infection of 3. 

[0291] FIG. 26 relates to the analysis of the expression of 
Ssol polypeptide by cell lines transduced with the lentiviral 
vectors TRIP-SD/SA-Ssol-WPRE and TRIP-SD/SA-Ssol- 
CTE. The secretion of the Ssol polypeptide was determined 
in the supernatant of a series of cell clones isolated after
transduction of FRhK-4 cells with the lentiviral vectors
TRIP-SD/SA-Ssol-WPRE and TRIP-SD/SA-Ssol-CTE. 5 µl
of supernatant, diluted 1/2 in loading buffer according to
Laemmli, were analyzed by Western blotting, visualized 
with an anti-FLAG monoclonal antibody (M2, Sigma) and 
an anti-mouse IgG(H+L) conjugate coupled to peroxidase. 
T-: supernatant of the parental FRhK-4 line. T+: supernatant 
of BHK cells infected with a recombinant vaccinia virus 
expressing the Ssol polypeptide. The solid arrow indicates 
the Ssol polypeptide, while the empty arrow indicates a 
cross reaction with a protein of cellular origin. 
[0292] FIG. 27 shows the results relating to the analysis of 
the purified Ssol polypeptide 

[0293] A. 8, 2, 0.5 and 0.125 ng of recombinant Ssol 
polypeptide purified by anti-FLAG affinity chromatography 
and gel filtration (G75) were separated on 8% SDS poly- 
acrylamide gel. The Ssol polypeptide and variable quantities 
of molecular mass markers (MM) were visualized by stain- 
ing with silver nitrate (Gelcode SilverSNAP stain kit II, 
Pierce). 

B. Standard markers for analysis by SELDI-TOF mass 
spectrometry 

[0294] IgG: bovine IgG of MM 147300 

[0295] ConA: conalbumin of MM 77490 

[0296] HRP: horseradish peroxidase analyzed as a con- 
trol and of MM 43240 

C. Analysis by mass spectrometry (SELDI-TOF) of the 
recombinant Ssol polypeptide. 

[0297] The peaks A and B correspond to the single and 
double charged Ssol polypeptide. 

D. Sequencing of the N-terminal end of the recombinant
Ssol polypeptide. 5 Edman degradation cycles in liquid 
phase were carried out on an ABI494 sequencer (Applied 
Biosystems). 

[0298] FIG. 28 illustrates the influence of a splicing signal 
and of the CTE and WPRE sequences on the efficacy of the 
gene immunization with the aid of plasmid DNA encoding 
the SARS-CoV S 

A. Groups of 7 BALB/c mice were immunized twice at 4 
weeks' interval with the aid of 50 µg of plasmid DNA of
pCI, pcDNA-S, pCI-S, pcDNA-N and pCI-HA.

B. Groups of 6 BALB/c mice were immunized twice at 4 
weeks' interval with the aid of 2 µg, 10 µg or 50 µg of
plasmid DNA of pCI, pCI-S, pCI-S-CTE and pCI-S-WPRE. 
[0299] The immune sera collected 3 weeks after the sec- 
ond immunization were analyzed by indirect ELISA using a 
lysate of VeroE6 cells infected with SARS-CoV as antigen. 
The anti-SARS-CoV antibody titers are calculated as the 
reciprocal of the dilution producing a specific OD of 0.5 
after visualization with an anti-mouse IgG polyclonal antibody
coupled to peroxidase (NA931V, Amersham) and
TMB (KPL). 
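
The titer definition given here (the reciprocal of the dilution producing a specific OD of 0.5) can be sketched as an interpolation between the two dilutions that bracket the cutoff. The text does not say how the 0.5 crossing is located, so the log-linear interpolation, the function name `endpoint_titer` and any readings passed to it are assumptions made for illustration only, not the method or data of the experiments.

```python
import math


def endpoint_titer(reciprocal_dilutions, specific_ods, cutoff=0.5):
    """Return the reciprocal dilution at which the specific OD crosses `cutoff`.

    Assumption: the crossing is found by linear interpolation of OD against
    log10(dilution) between the two bracketing points. Returns None when the
    readings never cross the cutoff.
    """
    pairs = sorted(zip(reciprocal_dilutions, specific_ods))
    for (d_lo, od_lo), (d_hi, od_hi) in zip(pairs, pairs[1:]):
        if od_lo >= cutoff >= od_hi and od_lo != od_hi:
            fraction = (od_lo - cutoff) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + fraction * (
                math.log10(d_hi) - math.log10(d_lo)
            )
            return 10 ** log_titer
    return None


# Hypothetical readings: OD 1.0 at 1/100 and OD 0.0 at 1/1000.
titer = endpoint_titer([100, 1000], [1.0, 0.0])
print(round(titer))
```

Working on the logarithm of the dilution reflects the fact that sera are tested in serial dilutions; a different curve fit would give a slightly different titer.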

[0300] FIG. 29 shows the seroneutralization of the infec- 
tivity of SARS-CoV with the antibodies induced in mice 
after gene immunization with the aid of plasmid DNA 
encoding SARS-CoV S. Pools of immune sera collected 3 
weeks after the second immunization were prepared for each 
of the groups of experiments described in FIG. 28 and
evaluated for their capacity to seroneutralize the infectivity
of 100 TCID50 of SARS-CoV on FRhK-4 cells. 4 points are
produced for each of the 2-fold dilutions tested from 1/20. The
seroneutralizing titer is calculated according to the Reed and
Muench method as the reciprocal of the dilution neutralizing
the infectivity of 2 wells out of 4. 

A. Groups of BALB/c mice immunized twice at 4 weeks'
interval with the aid of 50 µg of plasmid DNA of pCI,
pcDNA-S, pCI-S, pcDNA-N and pCI-HA. □: preimmune 

B. Groups of BALB/c mice immunized twice at 4 weeks' 
interval with the aid of 2 µg, 10 µg or 50 µg of plasmid DNA
of pCI, pCI-S, pCI-S-CTE and pCI-S-WPRE. 

[0301] FIG. 30 illustrates the immunoreactivity of the 
recombinant Ssol polypeptide toward sera from patients 
suffering from SARS. The reactivity of sera from patients 
was analyzed by indirect ELISA test against solid phases 
prepared with the aid of the purified recombinant Ssol 
polypeptide. The antibodies from patients reacting with the 
solid phase at a dilution of '/too are visualized with a human 
anti-IgG(H+L) polyclonal antibody coupled to peroxidase 
(Amersham NA933V) and TMB plus H2O2 (KPL). The sera
of probable SARS cases are identified by a National Reference
Center for Influenza Virus serial number and by the
initials of the patient and the number of days elapsed since 
the onset of symptoms, where appropriate. The TV sera are 
control sera from subjects which were collected in France 
before the SARS epidemic which occurred in 2003. 

[0302] FIG. 31 shows the induction of antibodies directed 
against SARS-CoV after immunization with the recombinant
Ssol polypeptide. Two groups of 6 mice were immunized
at 3 weeks' interval with 10 µg of recombinant Ssol
polypeptide (Ssol group) adjuvanted with aluminum
hydroxide or, as a control, with adjuvant alone (mock group).
Three successive immunizations were performed and the
immune sera were collected 3 weeks after each of the three
immunizations (IS1, IS2, IS3). The immune sera were
analyzed per pool for each of the 2 groups by indirect ELISA 
using a lysate of VeroE6 cells infected with SARS-CoV as 
antigen. The anti-SARS-CoV antibody titers are calculated 
as the reciprocal of the dilution producing a specific OD of 
0.5 after visualization with an anti-mouse IgG polyclonal 
antibody coupled to peroxidase (Amersham) and TMB 
(KPL). 

[0303] FIG. 32 presents the nucleotide alignment of the 
sequences of the synthetic gene 040530 with the sequence of 
the wild-type gene of the SARS-CoV isolate 031589. I-3059
corresponds to nucleotides 21406-25348 of the SARS-CoV
isolate 031589 deposited at the C.N.C.M. under the number
I-3059 (SEQ ID NO: 4, plasmid pSARS-S). S-040530 is the
sequence of the synthetic gene 040530.
[0304] FIG. 33 illustrates the use of a synthetic gene for 
the expression of the SARS-CoV S. Cellular extracts pre- 
pared 48 hours after transfection of VeroE6 cells (A) or 293T 
cells (B) with the plasmids pCI, pCI-S, pCI-S-CTE, pCI-S- 
WPRE and pCI-Ssynth were separated on 8% SDS acryla- 
mide gel and analyzed by Western blotting with the aid of an 
anti-S rabbit polyclonal antibody and an anti-rabbit IgG(H+L)
polyclonal antibody coupled to peroxidase (NA934V,
Amersham). The Western blot is visualized by luminescence
(ECL+, Amersham) and acquisition on a digital imaging
device (Fluor S, BioRad). The levels of expression of the S 
protein were measured by quantifying the 2 predominant 
bands identified on the image. 

[0305] FIG. 34 presents a diagram for the construction of 
recombinant vaccinia viruses VV-TG-S, VV-TG-Ssol,
VV-TN-S and VV-TN-Ssol

A. The cDNAs for the S protein and the Ssol polypeptide of
SARS-CoV were inserted between the BamHI and SmaI
sites of the transfer plasmid pTG186 in order to obtain the 
plasmids pTG-S and pTG-Ssol. 

[0306] B. The sequences of the synthetic promoter 480 
were then substituted for those of the 7.5 promoter by 
exchange of the NdeI-PstI fragments of the plasmids
pTG186poly, pTG-S and pTG-Ssol in order to obtain the 
transfer plasmids pTN480, pTN-S and pTN-Ssol. 
[0307] C. Sequence of the synthetic promoter 480 as 
contained between the NdeI and PstI sites of the transfer
plasmids of the pTN series. An AscI site was inserted in
order to facilitate subsequent handling. The restriction sites 
and the promoter sequence are underlined. 

D. The recombinant vaccinia viruses are obtained by double 
homologous recombination in vivo between the TK cassette 
of the transfer plasmids of the pTG and pTN series and the 
TK gene of the Copenhagen strain of the vaccinia virus. 

[0308] SP: signal peptide predicted (aa 1-13) with the 
software signalP v2.0 (Nielsen et al., 1997, Protein 
Engineering, 10:1-6) 
[0309] TM: transmembrane region predicted (aa 1196- 
1218) with the software TMHMM v2.0 (Sonnhammer 
et al., 1998, Proc. of Sixth Int. Conf. on Intelligent 
Systems for Molecular Biology, pp. 175-182, AAAI 
Press). It should be noted that the amino acids W1194 
and P1195 possibly form part of the transmembrane
region with respective probabilities of 0.13 and 0.42. 
[0310] TK-L, TK-R: left- and right-hand parts of the
vaccinia virus thymidine kinase gene
[0311] MCS: multiple cloning site 
[0312] PE: early promoter 
[0313] PL: late promoter 
[0314] PL synth: synthetic late promoter 480 
[0315] FIG. 35 illustrates the expression of the S protein 
by recombinant vaccinia viruses, analyzed by Western blot- 
ting. Cellular extracts were prepared 18 hours after infection 
of CV1 cells with the recombinant vaccinia viruses VV-TG,
VV-TG-S and VV-TN-S at an M.O.I. of 2 (A). As a control,
extracts of VeroE6 cells were prepared 8 hours after infection
with SARS-CoV at a multiplicity of infection of 2.
Cellular extracts were also prepared 18 hours after infection
of CV1 cells with the recombinant vaccinia viruses VV-TG-S,
VV-TG-Ssol, VV-TN, VV-TN-S and VV-TN-Ssol (B).
They were separated on 8% SDS acrylamide gels and 
analyzed by Western blotting with the aid of an anti-S rabbit 
polyclonal antibody and an anti-rabbit IgG(H+L) polyclonal 
antibody coupled to peroxidase (NA934V, Amersham). "1
µl" and "10 µl" indicate the quantities of cellular extracts
deposited on the gel. A molecular mass ladder (kDa) is 
presented in the figure. 
[0316] SARS-CoV: extract of VeroE6 cells infected
with SARS-CoV 

[0317] Mock: control extract of noninfected cells 

[0318] FIG. 36 shows the result of a Western-blot analysis 
of the secretion of the Ssol polypeptide by the recombinant 
vaccinia viruses. 

A. Supernatants of CV1 cells infected with the recombinant
vaccinia virus VV-TN, various clones of the VV-TN-Ssol
virus and with the viruses VV-TG-Ssol or VV-TN-Sflag
were harvested 18 hours after infection of CV1 cells at an
M.O.I. of 2.

[0319] B. Supernatants of 293T, FRhK-4, BHK-21 and
CV1 cells infected in duplicate (1, 2) with the recombinant
vaccinia virus VV-TN-Ssol at an M.O.I. of 2 were harvested
18 hours after infection. The supernatant of CV1 cells
infected with the virus VV-TN was also harvested as a
control (M).

[0320] All the supernatants were separated on 8% SDS 
acrylamide gel according to Laemmli and analyzed by 
Western blotting with the aid of an anti-FLAG mouse 
monoclonal antibody and an anti-mouse IgG(H+L) poly- 
clonal antibody coupled to peroxidase (NA931V, Amer- 
sham) (A) or with the aid of an anti-S rabbit polyclonal 
antibody and an anti-rabbit IgG(H+L) polyclonal antibody 
coupled to peroxidase (NA934V, Amersham) (B).
[0321] A molecular mass ladder (kDa) is presented in the 
figure. 

[0322] FIG. 37 shows the analysis of the Ssol polypeptide, 
purified on SDS polyacrylamide gel 

[0323] 10, 5 and 2 µl of recombinant Ssol polypeptide
purified by anti-FLAG affinity chromatography were sepa- 
rated on 4 to 15% gradient SDS polyacrylamide gel. The 
Ssol polypeptide and variable quantities of molecular mass 
markers (MM) were visualized by staining with silver nitrate 
(Gelcode SilverSNAP stain kit II, Pierce). 
[0324] FIG. 38 illustrates the immunoreactivity of the 
recombinant Ssol polypeptide produced by the recombinant 
vaccinia virus VV-TN-Ssol toward sera of patients suffering
from SARS. The reactivity of sera from patients was ana- 
lyzed by indirect ELISA test against solid phases prepared 
with the aid of the purified recombinant Ssol polypeptide. 
The antibodies from patients reacting with the solid phase at 
a dilution of Vioo and Vioo are visualized with a human 
anti-IgG(H+L) polyclonal antibody coupled to peroxidase 
(Amersham NA933V) and TMB plus H2O2 (KPL). The sera
of probable SARS cases are identified by a National Refer- 
ence Center for Influenza Virus serial number and by the 
initials of the patient and the number of days elapsed since 
the onset of symptoms, where appropriate. The TV sera are 
control sera from subjects which were collected in France 
before the SARS epidemic which occurred in 2003. 
[0325] FIG. 39 shows the anti-SARS-CoV antibody 
response in mice after immunization with the recombinant 
vaccinia viruses. Groups of 7 BALB/c mice were immunized
by the i.v. route twice at 4 weeks' interval with 10^6 pfu
of recombinant vaccinia viruses VV-TG, VV-TG-HA,
VV-TG-S, VV-TG-Ssol, VV-TN, VV-TN-S, VV-TN-Ssol.
[0326] A. Pools of immune sera collected 3 weeks after 
each of the two immunizations were prepared for each of the
groups and were analyzed by indirect ELISA using a lysate
of VeroE6 cells infected with SARS-CoV as antigen. The
anti-SARS-CoV antibody titers are calculated as the reciprocal
of the dilution producing a specific OD of 0.5 after
visualization with an anti-mouse IgG polyclonal antibody
coupled to peroxidase (NA931V, Amersham) and TMB
(KPL). 

[0327] B. The pools of immune sera were evaluated for 
their capacity to seroneutralize the infectivity of 100 
TCID50 of SARS-CoV on FRhK-4 cells. 4 points are 
produced for each of the 2-fold dilutions tested from 1/20. The
seroneutralizing titer is calculated according to the Reed and
Muench method as the reciprocal of the dilution neutralizing
the infectivity of 2 wells out of 4. 
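
A 50% endpoint of this kind (2 protected wells out of 4) is commonly computed with the cumulative formulation of the Reed and Muench method. The sketch below is one such formulation and is offered as an assumption, since the text does not spell out the arithmetic; the function name `reed_muench_titer` and the well counts in the example are hypothetical.

```python
import math


def reed_muench_titer(reciprocal_dilutions, protected_wells, wells_per_dilution=4):
    """Estimate the reciprocal dilution protecting 50% of wells.

    Cumulative protected wells are summed from the highest dilution downward,
    cumulative unprotected wells from the lowest dilution upward, and the 50%
    point is interpolated on a log10 scale between the bracketing dilutions.
    Returns None if the series never crosses 50%.
    """
    order = sorted(range(len(reciprocal_dilutions)),
                   key=lambda i: reciprocal_dilutions[i])
    dilutions = [reciprocal_dilutions[i] for i in order]
    protected = [protected_wells[i] for i in order]
    unprotected = [wells_per_dilution - p for p in protected]

    n = len(dilutions)
    cum_protected = [sum(protected[i:]) for i in range(n)]
    cum_unprotected = [sum(unprotected[: i + 1]) for i in range(n)]
    percent = [100.0 * cp / (cp + cu)
               for cp, cu in zip(cum_protected, cum_unprotected)]

    for i in range(n - 1):
        if percent[i] >= 50.0 > percent[i + 1]:
            distance = (percent[i] - 50.0) / (percent[i] - percent[i + 1])
            step = math.log10(dilutions[i + 1] / dilutions[i])
            return 10 ** (math.log10(dilutions[i]) + distance * step)
    return None


# Hypothetical plate: 2-fold dilutions from 1/20, 4 wells per dilution.
print(reed_muench_titer([20, 40, 80, 160], [4, 3, 1, 0]))
```

With 4, 4, 2 and 0 protected wells the estimate falls on the 1/80 dilution itself, which matches the "2 wells out of 4" wording of the legend.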

[0328] FIG. 40 describes the construction of the recom- 
binant viruses MVSchw2-SARS-S and MVSchw2-SARS- 
Ssol. 

[0329] A. The measles vector is a complete genome of the 
Schwarz vaccine strain of the measles virus (MV) into 
which an additional transcription unit has been introduced 
(Combredet, 2003, Journal of Virology, 77: 11546-11554). 
The expression of the additional open reading frames (ORF) 
is controlled by cis-acting elements necessary for the tran- 
scription, for the formation of the cap and for the polyade- 
nylation of the transgene which were copied from the 
elements present at the N/P junction. 2 different vectors 
allow the insertion between the P (phosphoprotein) and M 
(matrix) genes on the one hand and the H (hemagglutinin) 
and L (polymerase) genes on the other hand. 
[0330] B. The recombinant genomes MVSchw2-SARS-S 
and MVSchw2-SARS-Ssol of the measles virus were con- 
structed by inserting the ORFs of the S protein and of the 
Ssol polypeptide into an additional transcription unit located 
between the P and M genes of the vector. 
[0331] The various genes of the measles virus (MV) are 
indicated: N (nucleoprotein), P/V/C (phosphoprotein and V/C
proteins), M (matrix), F (fusion), H (hemagglutinin), L (polymerase).
T7=T7 RNA polymerase promoter, hh=hammerhead
ribozyme, T7t=T7 phage RNA polymerase terminator
sequence, δ=ribozyme of the hepatitis δ virus, (2), (3)=
additional transcription units (ATU). 

[0332] Size of the MV genome: 15 894 nt. 

[0333] SP: signal peptide 

[0334] TM: transmembrane region 

[0335] FLAG: FLAG tag 

[0336] FIG. 41 illustrates the expression of the S protein 
by the recombinant measles viruses, analyzed by Western 
blotting. 

[0337] Cytoplasmic extracts were prepared after infection 
of Vero cells by different passages of the viruses MVSchw2-
SARS-S and MVSchw2-SARS-Ssol and the wild-type virus
MVSchw as control. Cellular extracts in loading buffer
according to Laemmli were also prepared 8 hours after 
infection of VeroE6 cells with SARS-CoV at a multiplicity 
of infection of 3. They were separated on 8% SDS acryla- 
mide gel and analyzed by Western blotting with the aid of an 
anti-S rabbit polyclonal antibody and an anti-rabbit IgG(H+ 
L) polyclonal antibody coupled to peroxidase (NA934V, 
Amersham).
[0338] A molecular mass ladder (kDa) is presented in the figure.



t of VeroE6 cells infected 



[0342] FIG. 42 shows the expression of the S protein by
the recombinant measles viruses, analyzed by immunofluorescence.



[0343] Vero cells in monolayers on glass slides were 
infected with the wild-type virus MVSchw (A) or the
viruses MVSchw2-SARS-S (B) and MVSchw2-SARS-Ssol 
(C). When the syncytia have reached 30 to 40% confluence 
(A., B.) or 90-100% (C), the cells were fixed, permeabilized 
and labeled with anti-SARS-CoV rabbit polyclonal antibod- 
ies and an anti-rabbit IgG(H+L) conjugate coupled to FITC 
(Jackson). 

[0344] FIG. 43 illustrates the Western-blot analysis of the 
immunoreactivity of rabbit sera directed against the peptides 
E1-12, E53-76 and M2-14. The rabbit 20047 was immunized
with the peptide E1-12 coupled to KLH. The rabbits
22234 and 22240 were immunized with the peptide E53-76 
coupled to KLH. The rabbits 20013 and 20080 were immu- 
nized with the peptide M2-14 coupled to KLH. The immune 
sera were analyzed by Western blotting with the aid of 
extracts of cells infected with SARS-CoV (B) or with the aid 
of extracts of cells infected with a recombinant vaccinia 
virus expressing the protein E (A) or M (C) of the SARS- 
CoV 031589 isolate. The immunoblots were visualized with 
the aid of an anti-rabbit IgG(H+L) conjugate coupled to 
peroxidase (NA934V, Amersham). 

[0345] The position of the E and M proteins is indicated by 



[0346] A molecular mass ladder (kDa) is presented in the
figure.

[0347] It should be understood, however, that these 
examples are given solely by way of illustration of the 
subject of the invention, and do not constitute in any manner 
a limitation thereto. 

EXAMPLE 1 

Cloning and Sequencing of the Genome of the 
SARS-CoV Strain Derived from the Sample 
Recorded Under the Number 031589 

[0348] The RNA of the SARS-CoV strain was extracted 
from the sample of bronchoalveolar washing recorded under 
the number 031589, performed on a patient at the Hanoi 
(Vietnam) French hospital suffering from SARS. 

[0349] The isolated RNA was used as template to amplify 
the cDNAs corresponding to the various open reading 
frames of the genome (ORF1a, ORF1b, ORF-S, ORF-E,
ORF-M, ORF-N (including ORF-13 and ORF-14), ORF3,
ORF4, ORF7 to ORF11), and to the noncoding 5' and 3'
ends. The sequences of the primers and of the probes used 
for the amplification/detection were defined based on the 
available SARS-CoV nucleotide sequence. 



[0350] In the text which follows, the primers and the 
probes are identified by: the letter S, followed by a letter 
which indicates the corresponding region of the genome (L 
for the 5' end including ORF1a and ORF1b; S, M and N for
ORF-S, ORF-M, ORF-N, SE and MN for the corresponding
intergene regions), and then optionally by Fn, Rn, with n
between 1 and 6 corresponding to the primers used for the
nested PCR (F1+R1 pair for the first amplification, F2+R2
pair for the second amplification, and the like), and then by
/+/ or /-/ corresponding to a sense or antisense primer and
finally by the positions of the primers with reference to the
Genbank sequence AY274119.3; for the sense and antisense
S and N primers and the other sense primers only, when a 
single position is indicated, it corresponds to that of the 5' 
end of a probe or of a primer of about 20 bases; for the 
antisense primers other than the S and N primers, when a 
single position is indicated, it corresponds to that of the 3' 
end of a probe or of a primer of about 20 bases. 
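
The naming convention just described can be illustrated with a small parser that splits a primer name on its slashes. This is only a sketch: the class `PrimerName`, the function `parse_primer_name` and the field names are hypothetical, and the parser deliberately keeps the leading fields as uninterpreted labels because names such as S/F1/+/21350-21372 and S/S/+/21867 do not carry the same number of fields.

```python
import re
from typing import NamedTuple, Optional, Tuple


class PrimerName(NamedTuple):
    labels: Tuple[str, ...]          # leading fields, e.g. ("S", "F1") or ("S", "S")
    nested_step: Optional[str]       # "F1".."F6" or "R1".."R6" when present
    orientation: str                 # "sense" for /+/, "antisense" for /-/
    first_position: int              # first coordinate written in the name
    second_position: Optional[int]   # second coordinate, None if only one is given


def parse_primer_name(name: str) -> PrimerName:
    """Split a name such as 'S/R1/-/23518-23498' into its components."""
    fields = name.strip().split("/")
    if len(fields) < 3:
        raise ValueError(f"not enough fields in {name!r}")
    *labels, strand, positions = fields
    if strand not in ("+", "-"):
        raise ValueError(f"orientation must be + or - in {name!r}")
    first, _, second = positions.partition("-")
    nested = next(
        (label for label in labels if re.fullmatch(r"[FR][1-6]", label)), None
    )
    return PrimerName(
        labels=tuple(labels),
        nested_step=nested,
        orientation="sense" if strand == "+" else "antisense",
        first_position=int(first),
        second_position=int(second) if second else None,
    )


print(parse_primer_name("S/F1/+/21350-21372"))
print(parse_primer_name("S/S/+/21867"))
```

Antisense names list the higher coordinate first (23518-23498), so the parser returns the positions in the order written rather than sorting them.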
[0351] The amplification products thus generated were 
sequenced with the aid of specific primers in order to 
determine the complete sequence of the genome of the 
SARS-CoV strain derived from the sample recorded under 
the number 031589. These amplification products, with the 
exception of those corresponding to ORF1a and ORF1b,
were then cloned into expression vectors in order to produce
the corresponding viral proteins and the antibodies directed
against these proteins, in particular by DNA-based immunization.

1. Extraction of the RNAs

[0352] The RNAs were extracted with the aid of the 
QIAamp viral RNA extraction mini kit (QIAGEN) according
to the manufacturer's recommendations. More specifically:
140 µl of the sample and 560 µl of AVL buffer were
vigorously mixed for 15 seconds, incubated for 10 minutes
at room temperature and then briefly centrifuged at maximum
speed. 560 µl of 100% ethanol were added to the
supernatant and the mixture thus obtained was very vigorously
stirred for 15 sec. 630 µl of the mixture were then
deposited on the column. 

[0353] The column was placed on a 2 ml tube, centrifuged 
for 1 min at 8000 rpm, and then the remainder of the 
preceding mixture was deposited on the same column, 
centrifuged again, for 1 min at 8000 rpm, and the column 
was transferred over a clean 2 ml tube. Next, 500 µl of AW1
buffer were added to the column, and then the column was
centrifuged for 1 min at 8000 rpm and the eluate was
discarded. 500 µl of AW2 buffer were added to the column
which was then centrifuged for 3 min at 14 000 rpm and
transferred onto a 1.5 ml tube. Finally, 60 µl of AVE buffer
were added to the column which was incubated for 1 to 2 
min at room temperature and then centrifuged for 1 min at 
8000 rpm. The eluate corresponding to the purified RNA 
was recovered and frozen at -20° C. 
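
As a consistency check on the volumes of this extraction, the short sketch below adds up the lysate and counts how many 630 µl loads the column needs. It assumes a sample volume of 140 µl, and the constant and function names are illustrative only.

```python
import math

# Volumes in microliters, as read from the protocol (sample volume assumed 140).
SAMPLE_UL = 140
AVL_BUFFER_UL = 560
ETHANOL_UL = 560
LOAD_PER_SPIN_UL = 630


def column_loads(total_ul: float, load_ul: float = LOAD_PER_SPIN_UL) -> int:
    """Number of successive loads needed to pass the whole lysate through the column."""
    return math.ceil(total_ul / load_ul)


lysate_ul = SAMPLE_UL + AVL_BUFFER_UL + ETHANOL_UL
print(lysate_ul, column_loads(lysate_ul))
```

Two loads of 630 µl account for the whole mixture, which agrees with the protocol depositing the remainder on the same column for a second spin.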

2. Amplification, Sequencing and Cloning of the cDNAs 
2.1) cDNA Encoding the S Protein 

[0354] The RNAs extracted from the sample were sub- 
jected to reverse transcription with the aid of random 
sequence hexameric oligonucleotides (pdN6), so as to pro- 
duce cDNA fragments. 

[0355] The sequence encoding the SARS-CoV S glyco- 
protein was amplified in the form of two overlapping DNA 
fragments: 5' fragment (SARS-Sa, SEQ ID NO: 5) and 3'
fragment (SARS-Sb, SEQ ID NO: 6), by carrying out two
successive amplifications with the aid of nested primers. The
amplicons thus obtained were sequenced, cloned into the
plasmid vector PCR 2.1-TOPO™ (INVITROGEN), and
then the sequence of the cloned cDNAs was determined. 

a) Cloning and Sequencing of the Sa and Sb Fragments 
a.1) Synthesis of the cDNA

[0356] The reaction mixture containing: RNA (5 µl), H2O
for injection (3.5 µl), 5x reverse transcriptase buffer (4 µl),
5 mM dNTP (2 µl), pdN6 100 µg/ml (4 µl), RNasin 40 IU/µl
(0.5 µl) and reverse transcriptase AMV-RT, 10 IU/µl,
PROMEGA (1 µl) was incubated in a thermocycler under
the following conditions: 45 min at 42° C., 15 min at 55° C.,
5 min at 95° C., and then the cDNA obtained was kept at +4°
C.

a.2) First PCR Amplification 

[0357] The 5' and 3' ends of the S gene were respectively
amplified with the pairs of primers S/F1/+/21350-21372 and
S/R1/-/23518-23498, S/F3/+/23258-23277 and
S/R3/-/25382-25363. The 50 ul reaction mixture containing: cDNA
(2 ul), 50 uM primers (0.5 ul), 10x buffer (5 ul), 5 mM dNTP
(2 ul), Taq Expand High Fidelity, Roche (0.75 ul) and H2O
(39.75 ul) was amplified in a thermocycler, under the
following conditions: an initial step of denaturation at 94° C.
for 2 min was followed by 40 cycles comprising: a step of
denaturation at 94° C. for 30 sec, a step of annealing at 55°
C. for 30 sec and then a step of extension at 72° C. for 2 min
30 sec, with 10 sec of additional extension at each cycle, and
then a final step of extension at 72° C. for 5 min.
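As a rough aid to reading the cycling program above, the following sketch adds up the programmed hold times. It assumes, as one possible reading of "10 sec of additional extension at each cycle", that the first cycle uses the base extension time and each later cycle adds a further 10 sec; ramping between temperatures is ignored. The function name and parameters are illustrative only.

```python
def program_seconds(initial, cycles, denature, anneal, extend, increment, final):
    """Sum of programmed hold times in seconds (temperature ramps ignored)."""
    total = initial + final
    for k in range(cycles):
        # Assumption: cycle k (0-based) extends for extend + k * increment seconds.
        total += denature + anneal + extend + k * increment
    return total


# First PCR of the S gene: 94 C 2 min; 40 x (94 C 30 s, 55 C 30 s, 72 C 2 min 30 s
# with +10 s per cycle); 72 C 5 min.
first_pcr = program_seconds(initial=120, cycles=40, denature=30, anneal=30,
                            extend=150, increment=10, final=300)
print(f"{first_pcr} s, about {first_pcr / 3600:.1f} h of hold time")
```

Under this reading the increment roughly doubles the run time compared with a fixed 2 min 30 sec extension, which is worth keeping in mind when scheduling the nested reaction.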
a.3) Second PCR Amplification 

[0358] The products of the first PCR amplification (5' and 
3' amplicons) were subjected to a second PCR amplification 
step (nested PCR) under conditions identical to those of the 
first amplification, with the pairs of primers S/F2/+/21406- 
21426 and S/R2/-/23454-23435 and S/F4/+/23322-23341 
and S/R4/-/25348-25329, respectively for the 5' amplicon 
and the 3' amplicon. 
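The nesting of the two primer sets can be checked from the coordinates embedded in the primer names. The sketch below is only an illustration on the Tor2 reference numbering (GenBank AY274119.3); the dictionary and function names are hypothetical, and the spans it prints are reference-coordinate spans, which may differ slightly from the sizes of the amplicons actually obtained from the sample.

```python
# Outermost primer coordinates of each pair, read from the primer names in the text.
OUTER = {"5'": (21350, 23518), "3'": (23258, 25382)}   # first PCR
INNER = {"5'": (21406, 23454), "3'": (23322, 25348)}   # nested PCR (Sa and Sb)


def span(start, end):
    """Number of reference positions covered, both ends included."""
    return end - start + 1


def is_nested(inner, outer):
    """True when the inner amplicon lies entirely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


for end in ("5'", "3'"):
    print(end, "nested:", is_nested(INNER[end], OUTER[end]),
          "inner span:", span(*INNER[end]))

# Sa and Sb share the reference positions between the Sb start and the Sa end.
overlap = span(INNER["3'"][0], INNER["5'"][1])
print("Sa/Sb overlap on the reference:", overlap)
```

The overlap between Sa and Sb is what later allows the two cloned fragments to be joined into a single S cDNA.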

a.4) Cloning and Sequencing of the Sa and Sb Fragments 
[0359] The Sa (5' end) and Sb (3' end) amplicons thus 
obtained were purified with the aid of the QIAquick PCR 
purification kit (QIAGEN), following the manufacturer's 
instructions, and then they were cloned into the vector 
PCR2.1-TOPO (Invitrogen kit), to give the plasmids called 
SARS-S1 and SARS-S2. 

[0360] The DNA of the Sa and Sb clones was isolated and
then the corresponding insert was sequenced with the aid of
the Big Dye kit, Applied Biosystem® and universal primers
M13 forward and M13 reverse, and primers: S/S/+/21867,
S/S/+/22353, S/S/+/22811, S/S/+/23754, S/S/+/24207,
S/S/+/24699, S/S/+/24348, S/S/-/24209, S/S/-/23630,
S/S/-/23038, S/S/-/22454, S/S/-/21815, S/S/-/24784,
S/S/+/21556, S/S/+/23130 and S/S/+/24465 following the
manufacturer's instructions; the sequences of the Sa and Sb
fragments thus obtained correspond to the sequences SEQ
ID NO: 5 and SEQ ID NO: 6 in the sequence listing
appended as an annex.

[0361] The plasmid, called SARS-S1, was deposited
under the No. I-3020, on May 12, 2003, at the Collection
Nationale de Cultures de Microorganismes, 25 rue du Docteur
Roux, 75724 Paris Cedex 15; it contains a 5' fragment
of the sequence of the S gene of the SARS-CoV strain
derived from the sample recorded under the No. 031589, as
defined above, said fragment called Sa corresponding to the
nucleotides at positions 21406 to 23454 (SEQ ID NO: 5),
with reference to the Genbank sequence AY274119.3 Tor2.

[0362] The plasmid, called TOP10P-SARS-S2, was
deposited under the No. I-3019, on May 12, 2003, at the
Collection Nationale de Cultures de Microorganismes, 25
rue du Docteur Roux, 75724 Paris Cedex 15; it contains a 3'
fragment of the sequence of the S gene of the SARS-CoV
strain derived from the sample recorded under the No.
031589, as defined above, said fragment called Sb corresponding
to the nucleotides at positions 23322 to 25348
(SEQ ID NO: 6), with reference to the Genbank sequence
accession No. AY274119.3.

b) Cloning and Sequencing of the Complete cDNA (SARS-S 
Clone of 4 kb) 

[0363] The complete S cDNA was obtained from the
abovementioned clones SARS-S1 and SARS-S2, in the
following manner:

[0364] 1) A PCR amplification reaction was carried out on 
a SARS-S2 clone in the presence of the above-mentioned 
primer S/R4/-/25348-25329 and of the primer S/S/+/24696- 
24715: an amplicon of 633 bp was obtained, 

[0365] 2) Another PCR amplification reaction was carried 
out on another SARS-S2 clone, in the presence of the 
primers S/F4/+/23322-23341 mentioned above and S/S/-/ 
24803-24784: an amplicon of 1481 bp was obtained. 
[0366] The amplification reaction was carried out under 
the conditions as defined above for the amplification of the 
Sa and Sb fragments, with the exception that 30 amplifica- 
tion cycles comprising a step of denaturation at 94° C. for 20 
sec and a step of extension at 72° C. for 2 min 30 sec were 
carried out. 

[0367] 3) The 2 amplicons (633 bp and 1481 bp) were 
purified under the conditions as defined above for the Sa and 
Sb fragments. 

[0368] 4) Another PCR amplification reaction with the aid 
of the abovementioned primers S/F4/+/23322-23341 and 
S/R4/-/25348-25329 was carried out on the purified ampli- 
cons obtained in 3). The amplification reaction was carried 
out under the conditions as defined above for the amplifi- 
cation of the Sa and Sb fragments, except that 30 amplifi- 
cation cycles were performed. 

[0369] The 2026 bp amplicon thus obtained was purified, 
cloned into the vector PCR2.1-TOPO and then sequenced as 
above, with the aid of the primers as defined above for the 
Sa and Sb fragments. The clone thus obtained was called 
clone 3'. 

[0370] 5) The clone SARS-S1 obtained above and the
clone 3' were digested with EcoR I, the bands of about 2 kb
thus obtained were gel purified and then amplified by PCR
with the abovementioned primers S/F2/+/21406-21426 and
S/R4/-/25348-25329. The amplification reaction was carried
out under the conditions as defined above for the
amplification of the Sa and Sb fragments, except that 30
amplification cycles were performed. The amplicon of about
4 kb was purified and sequenced. It was then cloned into the
vector PCR2.1-TOPO in order to give the plasmid, called
SARS-S, and the insert obtained in this plasmid was
sequenced as above, with the aid of the primers as defined
above for the Sa and Sb fragments. The cDNA sequences of
the insert and of the amplicon encoding the S protein
correspond respectively to the sequences SEQ ID NO: 4 and
SEQ ID NO: 2 in the sequence listing appended as an annex;
they encode the S protein (SEQ ID NO: 3).

[0371] The sequence of the amplicon corresponding to the 
cDNA encoding the S protein of the SARS-CoV strain 
derived from the sample No. 031589 has the following two 
mutations compared with the corresponding sequences of 
respectively the Tor2 and Urbani isolates, the positions of 
the mutations being indicated with reference to the complete 
sequence of the genome of the Tor2 isolate (Genbank 
AY274119.3): 

[0372] g/t in position 23220; the alanine codon (gct) in
position 577 of the amino acid sequence of the S
protein of Tor2 is replaced with a serine codon (tct),

[0373] c/t in position 24872: this mutation does not 
modify the amino acid sequence of the S protein, and 

the plasmid, called SARS-S, was deposited under the No.
I-3059, on Jun. 20, 2003, at the Collection Nationale de
Cultures de Microorganismes, 25 rue du Docteur Roux,
75724 Paris Cedex 15; it contains the cDNA sequence
encoding the S protein of the SARS-CoV strain derived
from the sample recorded under the No. 031589, said
sequence corresponding to the nucleotides at positions
21406 to 25348 (SEQ ID NO: 4), with reference to the
Genbank sequence AY274119.3.
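The relationship between a genome position, the codon it falls in and the resulting amino acid change can be illustrated as follows. This is a minimal sketch, not the procedure used by the applicants: the standard genetic code is assumed, and the start of the S open reading frame at position 21492 of AY274119.3 is an assumption taken from the GenBank annotation rather than from the text above.

```python
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in the conventional t, c, a, g order.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

S_ORF_START = 21492  # assumption: first base of the S start codon in AY274119.3


def codon_position(genome_pos, orf_start):
    """Return (1-based codon number, 0-based offset within that codon)."""
    offset = genome_pos - orf_start
    return offset // 3 + 1, offset % 3


for pos in (23220, 24872):
    codon, base = codon_position(pos, S_ORF_START)
    print(f"position {pos}: codon {codon}, base {base + 1} of the codon")

# The g/t change at 23220 turns gct into tct.
print(CODON_TABLE["gct"], "->", CODON_TABLE["tct"])
```

With these assumptions, position 23220 is the first base of codon 577, in agreement with the alanine to serine change described above, while position 24872 falls on the third base of its codon.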

2.2) cDNA Encoding the M and E Proteins 

[0374] The RNAs derived from the sample 031589, 
extracted as above, were subjected to a reverse transcription, 
combined, during the same step (Titan One Step RT-PCR® 
kit, Roche), with a PCR amplification reaction, with the aid 
of the pairs of primers: 

[0375] S/E/F1/+/26051-26070 and S/E/R1/-/26455- 
26436 in order to amplify ORF-E, and 

[0376] S/M/F1/+/26225-26244 and S/M/R1/-/27148- 
27129 in order to amplify ORF-M. 

[0377] A first reaction mixture containing: 8.6 ul of H2O
for injection, 1 ul of dNTP (5 mM), 0.2 ul of each of the
primers (50 uM), 1.25 ul of DTT (100 mM) and 0.25 ul of
RNAsin (40 IU/ul) was combined with a second reaction
mixture containing: 1 ul of RNA, 7 ul of H2O for injection,
5 ul of 5xRT-PCR buffer and 0.5 ul of enzyme mixture and
the combined mixtures were incubated in a thermocycler
under the following conditions: 30 min at 42° C., 10 min at
55° C., 2 min at 94° C. followed by 40 cycles comprising a
step of denaturation at 94° C. for 10 sec, a step of annealing
at 55° C. for 30 sec and a step of extension at 68° C. for 45
sec, with 3 sec increment per cycle and finally a step of
terminal extension at 68° C. for 7 min.
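The final concentrations in the combined one-step RT-PCR reaction follow from simple dilution arithmetic (stock concentration times added volume, divided by total volume). The sketch below assumes two primers at 0.2 ul each, as implied by "each of the primers"; the variable and function names are illustrative.

```python
# Volumes in ul, taken from the two mixtures described above.
mix1 = {"H2O": 8.6, "dNTP": 1.0, "primer_fwd": 0.2, "primer_rev": 0.2,
        "DTT": 1.25, "RNasin": 0.25}
mix2 = {"RNA": 1.0, "H2O": 7.0, "buffer_5x": 5.0, "enzyme_mix": 0.5}

total_ul = round(sum(mix1.values()) + sum(mix2.values()), 6)


def final_conc(stock, volume_ul):
    """Concentration after dilution into the combined reaction (same unit as stock)."""
    return stock * volume_ul / total_ul


print("total volume (ul):", total_ul)
print("dNTP (mM):", final_conc(5, mix1["dNTP"]))
print("each primer (uM):", final_conc(50, mix1["primer_fwd"]))
print("DTT (mM):", final_conc(100, mix1["DTT"]))
print("buffer (x):", final_conc(5, mix2["buffer_5x"]))
```

A total that differs from the intended reaction volume is usually the quickest sign that a component was mis-transcribed.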

[0378] The amplification products thus obtained (M and E
amplicons) were subjected to a second PCR amplification
(nested PCR, using the Expand High-Fi® kit, Roche), with
the aid of the pairs of primers:



[0379] S/E/F2/+/26082-26101 and S/E/R2/-/26413- 
26394 for the amplicon E, and 

[0380] S/M/F2/+/26330-26350 and S/M/R2/-/27098- 
27078 for the amplicon M. 

[0381] The reaction mixture containing: 2 ul of the product
of the first PCR, 39.25 ul of H2O for injection, 5 ul of
10x buffer containing MgCl2, 2 ul of dNTP (5 mM), 0.5 ul
of each of the primers (50 uM) and 0.75 ul of enzyme
mixture was incubated in a thermocycler under the following
conditions: a step of denaturation at 94° C. for 2 min was
followed by 30 cycles comprising a step of denaturation at
94° C. for 15 sec, a step of annealing at 60° C. for 30 sec and
a step of extension at 72° C. for 45 sec, with 3 sec increment
per cycle, and finally a step of terminal extension at 72° C.
for 7 min. The amplification products obtained corresponding
to the cDNAs encoding the E and M proteins were
sequenced as above, with the aid of the primers:
S/E/F2/+/26082 and S/E/R2/-/26394, S/M/F2/+/26330,
S/M/R2/-/27078 cited above and the primers S/M/+/26636-26655 and
S/M/-/26567-26548. They were then cloned, as above, in
order to give the plasmids called SARS-E and SARS-M. The
DNA of these clones was then isolated and sequenced with
the aid of the universal primers M13 forward and M13
reverse and the primers S/M/+/26636 and S/M/-/26548
mentioned above.

[0382] The sequence of the amplicon representing the
cDNA encoding the E protein (SEQ ID NO: 13) of the
SARS-CoV strain derived from the sample No. 031589 does
not contain differences in relation to the corresponding
sequences of the isolates AY274119.3-Tor2 and AY278741-Urbani.
The sequence of the E protein of the SARS-CoV
031589 strain corresponds to the sequence SEQ ID NO: 14
in the sequence listing appended as an annex.
[0383] The plasmid, called SARS-E, was deposited under
the No. I-3046, on May 28, 2003, at the Collection Nationale
de Cultures de Microorganismes, 25 rue du Docteur Roux,
75724 Paris Cedex 15; it contains the cDNA sequence
encoding the E protein of the SARS-CoV strain derived
from the sample recorded under the No. 031589, as defined
above, said sequence corresponding to the nucleotides at
positions 26082 to 26413 (SEQ ID NO: 15), with reference
to the Genbank sequence accession No. AY274119.3.
[0384] The sequence of the amplicon representing the
cDNA encoding M (SEQ ID NO: 16) from the SARS-CoV
strain derived from the sample No. 031589 does not contain
differences in relation to the corresponding sequence of the
isolate AY274119.3-Tor2. By contrast, at position 26857, the
isolate AY278741-Urbani contains a c and the sequence of
the SARS-CoV strain derived from the sample recorded
under the No. 031589 contains a t. This mutation results in
a modification of the amino acid sequence of the corresponding
protein: at position 154, a proline (AY278741-Urbani)
is changed to serine in the SARS-CoV strain derived
from the sample recorded under the No. 031589. The
sequence of the M protein of the SARS-CoV strain derived
from the sample recorded under the No. 031589 corresponds
to the sequence SEQ ID NO: 17 in the sequence listing
appended as an annex.

[0385] The plasmid, called SARS-M, was deposited under
the No. I-3047, on May 28, 2003, at the Collection Nationale
de Cultures de Microorganismes, 25 rue du Docteur Roux,
75724 Paris Cedex 15; it contains the cDNA sequence
encoding the M protein of the SARS-CoV strain derived
from the sample recorded under the No. 031589, as defined
above; said sequence corresponding to the nucleotides at
positions 26330 to 27098 (SEQ ID NO: 18), with reference
to the Genbank sequence accession No. AY274119.3.
2.3) cDNA Corresponding to ORF3, ORF4, ORF7 to ORF11 
[0386] The same amplification, cloning and sequencing 
strategy was used to obtain the cDNA fragments correspond- 
ing respectively to the following ORFs: ORF3, ORF4, 
ORF7, ORF8, ORF9, ORF10 and ORF11. The pairs of 
primers used for the first amplification are: 

[0387] ORF3 and ORF4: S/SE/F1/+/25069-25088 and 
S/SE/R1/-/26300-26281 

[0388] ORF7 to ORF11: S/MN/F1/+/26898-26917 and 
S/MN/R1/-/28287-28266 

[0389] The pairs of primers used for the second amplifi- 
cation are: 

[0390] ORF3 and ORF4: S/SE/F2/+/25110-25129 and 
S/SE/R2/-/26244-26225 

[0391] ORF7 to ORF11: S/MN/F2/+/26977-26996 and
S/MN/R2/-/28218-28199

[0392] The conditions for the first amplification (RT-PCR) 
are the following: 45 min at 42° C, 10 min at 55° C, 2 min 
at 94° C. followed by 40 cycles comprising a step of 
denaturation at 94° C. for 15 sec, a step of annealing at 58° 
C. for 30 sec and a step of extension at 68° C. for 1 min, with 
5 sec increment per cycle and finally a step of terminal 
extension at 68° C. for 7 min. 

[0393] The conditions for the nested PCR are the following:
a step of denaturation at 94° C. for 2 min was followed
by 40 cycles comprising a step of denaturation at 94° C. for
20 sec, a step of annealing at 58° C. for 30 sec and a step of
extension at 72° C. for 50 sec, with 4 sec increment per cycle
and finally a step of terminal extension at 72° C. for 7 min.
[0394] The amplification products obtained corresponding
to the cDNAs containing respectively ORF3 and 4 and
ORF7 to 11 were sequenced with the aid of the primers:
S/SE/+/25363, S/SE/+/25835, S/SE/-/25494,
S/SE/-/25875, S/MN/+/27839, S/MN/+/27409, S/MN/-/27836,
S/MN/-/27799 and cloned as above for the other ORFs, to
give the plasmids called SARS-SE and SARS-MN. The
DNA of these clones was isolated and sequenced with the
aid of these same primers and of the universal primers M13
sense and M13 antisense.

[0395] The sequence of the amplicon representing the
cDNA of the region containing ORF3 and ORF4 (SEQ ID
NO: 7) of the SARS-CoV strain derived from the sample No.
031589 contains a nucleotide difference in relation to the
corresponding sequence of the isolate AY274119-Tor2. This
mutation at position 25298 results in a modification of the
amino acid sequence of the corresponding protein (ORF3):
at position 11, an arginine (AY274119-Tor2) is changed to
glycine in the SARS-CoV strain derived from the sample
No. 031589. By contrast, no mutation was identified in
relation to the corresponding sequence of the isolate
AY278741-Urbani. The sequences of ORF3 and 4 of the
SARS-CoV strain derived from the sample No. 031589
correspond respectively to the sequences SEQ ID NO: 10
and 12 in the sequence listing appended as an annex.

[0396] The plasmid, called SARS-SE, was deposited
under the No. I-3126, on Nov. 13, 2003, at the Collection
Nationale de Cultures de Microorganismes, 25 rue du Docteur
Roux, 75724 Paris Cedex 15; it contains the cDNA
corresponding to the region situated between ORF-S and
ORF-E and overlapping ORF-E of the SARS-CoV strain
derived from the sample recorded under the No. 031589, as
defined above, said region corresponding to the nucleotides
at positions 25110 to 26244 (SEQ ID NO: 8), with reference
to the Genbank sequence accession No. AY274119.3.
[0397] The sequence of the amplicon representing the 
cDNA corresponding to the region containing ORF7 to 
ORF11 (SEQ ID NO: 19) of the SARS-CoV strain derived 
from the sample No. 031589 does not contain differences in 
relation to the corresponding sequences of the isolates 
AY274119-Tor2 and AY278741-Urbani. The sequences of 
ORF7 to 11 of the SARS-CoV strain derived from the 
sample No. 031589 correspond respectively to the 
sequences SEQ ID NO: 22, 24, 26, 28 and 30 in the sequence 
listing appended as an annex. 

[0398] The plasmid, called SARS-MN, was deposited
under the No. I-3125, on Nov. 13, 2003, at the Collection
Nationale de Cultures de Microorganismes, 25 rue du Docteur
Roux, 75724 Paris Cedex 15; it contains the cDNA
sequence corresponding to the region situated between
ORF-M and ORF-N of the SARS-CoV strain derived from
the sample recorded under the No. 031589 and collected in
Hanoi, as defined above, said sequence corresponding to the
nucleotides at positions 26977 to 28218 (SEQ ID NO: 20),
with reference to the Genbank sequence accession No.
AY274119.3.

[0399] The sequence of the amplicon representing the 
cDNA corresponding to the region containing ORF7 to 
ORF11 (SEQ ID NO: 19) of the SARS-CoV strain derived 
from the sample No. 031589 does not contain differences in 
relation to the corresponding sequences of the isolates 
AY274119-Tor2 and AY278741-Urbani. The sequences of 
ORF7 to 11 of the SARS-CoV strain derived from the 
sample No. 031589 correspond respectively to the 
sequences SEQ ID NO: 22, 24, 26, 28 and 30 in the sequence 
listing appended as an annex. 

2.4) cDNA Encoding the N Protein and Including ORF13
and ORF14

[0400] The cDNA was synthesized and amplified as
described above for the fragments Sa and Sb. More specifically,
the reaction mixture containing: 5 ul of RNA, 5 ul of
H2O for injection, 4 ul of 5x reverse transcriptase buffer, 2
ul of dNTP (5 mM), 2 ul of oligo 20T (5 uM), 0.5 ul of
RNasin (40 IU/ul) and 1.5 ul of AMV-RT (10 IU/ul
Promega) was incubated in a thermocycler under the following
conditions: 45 min at 42° C., 15 min at 55° C., 5 min
at 95° C., and it was then kept at +4° C.

[0401] A first PCR amplification was performed with the 
pair of primers S/N/F3/+/28023 and S/N/R3/-/29480. 
[0402] The reaction mixture as above for the amplification
of the S1 and S2 fragments was incubated in a thermocycler,
under the following conditions: an initial step of
denaturation at 94° C. for 2 min was followed by 40 cycles
comprising a step of denaturation at 94° C. for 20 sec, a step
of annealing at 55° C. for 30 sec and then a step of extension
at 72° C. for 1 min 30 sec with 10 sec of additional extension
at each cycle, and then a final step of extension at 72° C. for
5 min.

[0403] The amplicon obtained at the first PCR amplification
was subjected to a second PCR amplification step
(nested PCR) with the pair of primers S/N/F4/+/28054 and
S/N/R4/-/29430 under conditions identical to those of the
first amplification.

[0404] The amplification product obtained, corresponding
to the cDNA encoding the N protein of the SARS-CoV strain
derived from the sample No. 031589, was sequenced with
the aid of the primers: S/N/F4/+/28054, S/N/R4/-/29430,
S/N/+/28468, S/N/+/28918 and S/N/-/28607 and cloned as
above for the other ORFs, to give the plasmid called
SARS-N. The DNA of these clones was isolated and
sequenced with the aid of the universal primers M13 sense
and M13 antisense, and the primers S/N/+/28468,
S/N/+/28918 and S/N/-/28607.

[0405] The sequence of the amplicon representing the 
cDNA corresponding to ORF-N and including ORF13 and 
ORF14 (SEQ ID NO: 36) of the SARS-CoV strain derived 
from the sample No. 031589 does not contain differences in 
relation to the corresponding sequences of the isolates 
AY274119.3-Tor2 and AY278741-Urbani. The sequence of 
the N protein of the SARS-CoV strain derived from the 
sample No. 031589 corresponds to the sequence SEQ ID 
NO: 37 in the sequence listing appended as an annex. 
[0406] The sequences of ORF13 and 14 of the SARS-CoV 
strain derived from the sample No. 031589 correspond 
respectively to the sequences SEQ ID NO: 32 and 34 in the 
sequence listing appended as an annex. 
[0407] The plasmid, called SARS-N, was deposited under
the No. I-3048, on Jun. 5, 2003, at the Collection Nationale
de Cultures de Microorganismes, 25 rue du Docteur Roux,
75724 Paris Cedex 15; it contains the cDNA encoding the N
protein of the SARS-CoV strain derived from the sample
recorded under the No. 031589, as defined above, said
sequence corresponding to the nucleotides at positions
28054 to 29430 (SEQ ID NO: 38), with reference to the
Genbank sequence accession No. AY274119.3.
2.5) Noncoding 5' and 3' Ends 
a) Noncoding 5' end (5'NC) 
a1) Synthesis of the cDNA

[0408] The RNAs derived from the sample 031589, 
extracted as above, were subjected to reverse transcription 
under the following conditions: 

[0409] The RNA (15 ul) and the primer S/L/-/443 (3 ul at
the concentration of 5 uM) were incubated for 10 min at 75°
C.

[0410] Next, the 5x reverse transcriptase buffer (6 ul,
INVITROGEN), 10 mM dNTP (1 ul), 0.1 M DTT (3 ul)
were added and the mixture was incubated at 50° C. for 3

[0411] Finally, the reverse transcriptase (3 ul of Super- 
script®, INVITROGEN) was added to the preceding mix- 
ture which was incubated at 50° C. for 1 h 30 min and then 
at 90° C. for 2 min. 



[0412] The cDNA thus obtained was purified with the aid
of the QIAquick PCR purification kit (QIAGEN), according
to the manufacturer's recommendations.

b1) Terminal Transferase Reaction (TdT)

[0413] The cDNA (10 ul) is incubated for 2 min at 100° C.,
stored in ice, and the following are then added: H2O (2.5 ul),
5xTdT buffer (4 ul, AMERSHAM), 5 mM dATP (2 ul) and
TdT (1.5 ul, AMERSHAM). The mixture thus obtained is
incubated for 45 min at 37° C. and then for 2 min at 65° C.
[0414] The product obtained is amplified by a first PCR
reaction with the aid of the primers: S/L/-/225-206 and
anchor 14T: 5'-AGATGAATTCGGTACCTTTTTTTTTTTTTT-3'
(SEQ ID NO: 68). The amplification
conditions are the following: an initial step of denaturation
at 94° C. for 2 min is followed by 10 cycles comprising
a step of denaturation at 94° C. for 10 sec, a step of annealing
at 45° C. for 30 sec and then a step of extension at 72° C.
for 30 sec and then by 30 cycles comprising a step of
denaturation at 94° C. for 10 sec, a step of annealing at 50°
C. for 30 sec and then a step of extension at 72° C. for 30
sec, and then a final step of extension at 72° C. for 5 min.
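The logic of the anchored amplification can be made explicit with a short sketch: TdT in the presence of dATP adds a run of A residues to the 3' end of the cDNA, and the T stretch of the anchor 14T primer is complementary to that run. The code below only inspects the primer sequence given above; the tail length of 20 and the helper names are hypothetical, chosen for illustration.

```python
ANCHOR_14T = "AGATGAATTCGGTACCTTTTTTTTTTTTTT"  # anchor 14T, SEQ ID NO: 68

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def poly_t_tail(seq):
    """Length of the uninterrupted run of T at the 3' end."""
    return len(seq) - len(seq.rstrip("T"))


tail = poly_t_tail(ANCHOR_14T)
print("primer length:", len(ANCHOR_14T), "3' T run:", tail)

# Recognition sequences carried by the 5' part of the primer.
print("EcoRI site (GAATTC) present:", "GAATTC" in ANCHOR_14T)
print("KpnI site (GGTACC) present:", "GGTACC" in ANCHOR_14T)

# Hypothetical poly(A) tail added by TdT; its length is not given in the text.
tdt_tail = "A" * 20
print("T run can anneal to the tail:", revcomp(ANCHOR_14T[-tail:]) in tdt_tail)
```

The lower annealing temperature of the first 10 cycles (45° C.) is consistent with a primer that initially anneals only through its short T run.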
[0415] The product of the first PCR amplification was
subjected to a second amplification step with the aid of the
primers: S/L/-/204-185 and anchor 14T mentioned above
under conditions identical to those of the first amplification.
The amplicon thus obtained was purified, sequenced with
the aid of the primer S/L/-/182-163 and it was then cloned
as above for the different ORFs, to give the plasmid called
SARS-5'NC. The DNA of this clone was isolated and
sequenced with the aid of the universal primers M13 sense
and M13 antisense and the primer S/L/-/182-163 mentioned
above.

[0416] The amplicon representing the cDNA correspond- 
ing to the 5'NC end of the SARS-CoV strain derived from 
the sample recorded under the No. 031589 corresponds to 
the sequence SEQ ID NO: 72 in the sequence listing 
appended as an annex; this sequence does not contain 
differences in relation to the corresponding sequences of the 
isolates AY274119.3-Tor2 and AY278741-Urbani. 
[0417] The plasmid, called SARS-5'NC, was deposited
under the No. I-3124, on Nov. 7, 2003, at the Collection
Nationale de Cultures de Microorganismes, 25 rue du Docteur
Roux, 75724 Paris Cedex 15; it contains the cDNA
corresponding to the noncoding 5' end of the genome of the
SARS-CoV strain derived from the sample recorded under
the No. 031589, as defined above, said sequence corresponding
to the nucleotides at positions 1 to 204 (SEQ ID
NO: 39), with reference to the Genbank sequence accession
No. AY274119.3.
b) Noncoding 3' End (3'NC) 
a1) Synthesis of the cDNA

[0418] The RNAs derived from the sample 031589,
extracted as above, were subjected to reverse transcription,
according to the following protocol: the reaction mixture
containing: RNA (5 ul), H2O (5 ul), 5x reverse transcriptase
buffer (4 ul), 5 mM dNTP (2 ul), 5 uM Oligo 20T (2 ul), 40
U/ul RNasin (0.5 ul) and 10 IU/ul RT-AMV (1.5 ul,
PROMEGA) was incubated in a thermocycler, under the
following conditions: 45 min at 42° C., 15 min at 55° C., 5
min at 95° C., and it was then kept at +4° C.

[0419] The cDNA obtained was amplified by a first PCR
reaction with the aid of the primers S/N/+/28468-28487 and
anchor 14T mentioned above. The amplification conditions
are the following: an initial step of denaturation at 94° C. for
2 min is followed by 10 cycles comprising a step of
denaturation at 94° C. for 20 sec, a step of annealing at 45°
C. for 30 sec and then a step of extension at 72° C. for 50
sec and then 30 cycles comprising a step of denaturation at
94° C. for 20 sec, a step of annealing at 50° C. for 30 sec and
then a step of extension at 72° C. for 50 sec, and then a final
step of extension at 72° C. for 5 min.
[0420] The product of the first PCR amplification was 
subjected to a second amplification step with the aid of the 
primers S/N/+/28933-28952 and anchor 14T mentioned 
above, under conditions identical to those of the first ampli- 
fication. The amplicon thus obtained was purified, 
sequenced with the aid of the primer S/N/+/29257-29278 
and cloned as above for the different ORFs, to give the 
plasmid called SARS-3'NC. The DNA of this clone was 
isolated and sequenced with the aid of the universal primers 
M13 sense and M13 antisense and the primer S/N/+/29257- 
29278 mentioned above. 

[0421] The amplicon representing the cDNA corresponding
to the 3'NC end of the SARS-CoV strain derived from
the sample recorded under the No. 031589 corresponds to
the sequence SEQ ID NO: 73 in the sequence listing
appended as an annex; this sequence does not contain



[0422] The plasmid called SARS-3'NC was deposited
under the No. I-3123 on Nov. 7, 2003, at the Collection
Nationale de Cultures de Microorganismes, 25 rue du Docteur
Roux, 75724 Paris Cedex 15; it contains the cDNA
sequence corresponding to the noncoding 3' end of the
genome of the SARS-CoV strain derived from the sample
recorded under the No. 031589, as defined above; said
sequence, corresponding to that situated between the nucleotides
at positions 28933 and 29727 (SEQ ID NO: 40), with
reference to the Genbank sequence accession No.
AY274119.3, ends with a series of a nucleotides.

2.6) ORF1a and ORF1b

[0423] The amplification of the 5' region containing
ORF1a and ORF1b of the SARS-CoV genome derived from
the sample 031589 was performed by carrying out RT-PCR
reactions followed by nested PCRs according to the same
principles as those described above for the other ORFs. The
amplified fragments overlap over several tens of bases,
thus allowing computer reconstruction of the complete
sequence of this part of the genome. On average, the
amplified fragments are of two kilobases.
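The principle of reconstructing a longer sequence from overlapping fragments can be sketched as follows. This is a deliberately simple illustration, not the software used by the applicants (which the text does not name): it joins consecutive fragments on their longest exact suffix/prefix overlap, whereas a real assembly would also have to tolerate sequencing errors. The function names and the toy fragments are hypothetical.

```python
def merge_overlapping(left, right, min_overlap=3):
    """Join two sequences on the longest exact overlap between the end of
    `left` and the start of `right`."""
    longest = min(len(left), len(right))
    for n in range(longest, min_overlap - 1, -1):
        if left[-n:] == right[:n]:
            return left + right[n:]
    raise ValueError("fragments do not overlap sufficiently")


def assemble(fragments, min_overlap=3):
    """Merge fragments given in genome order into a single contig."""
    contig = fragments[0]
    for fragment in fragments[1:]:
        contig = merge_overlapping(contig, fragment, min_overlap)
    return contig


# Toy fragments with 6-base overlaps (real fragments are about 2 kb long).
toy = ["ATATTAGGTTTTTACC", "TTTACCTACCCAGG", "CCCAGGAAAAGCC"]
print(assemble(toy))
```

Requiring a minimum overlap length guards against joining fragments on a short chance match; with overlaps of several tens of bases, as described above, such chance matches become very unlikely.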

[0424] 14 overlapping fragments, called L0 to L12, were 
thus amplified with the aid of the following primers: 



Fragment  Positions    RT-PCR sense      RT-PCR antisense  Nested PCR sense  Nested PCR antisense

L0                     S/L0/F1/+30       S/L0/R1/-481
L1                     S/L1/F1/+147      S/L1/R1/-2338     S/L1/F2/+211      S/L1/R2/-2241
L2        2156-4167    S/L2/F1/+2033     S/L2/R1/-4192     S/L2/F2/+2136     S/L2/R2/-4168
L3        3913-5324    S/L3bis/F1/+3850  S/L3bis/R1/-5365                    S/L3bis/R2/-5325
L4b       4952-6023    S/L4b/F1/+4878    S/L4b/R1/-6061    S/L4b/F2/+4932    S/L4b/R2/-6024
L4        5325-7318    S/L4/F1/+5272     S/L4/R1/-7392     S/L4/F2/+5305     S/L4/R2/-7323
L5        7296-9156    S/L5/F1/+7111     S/L5/R1/-9253     S/L5/F2/+7275     S/L5/R2/-9157
L6        9053-11066   S/L6/F1/+8975     S/L6/R1/-11151    S/L6/F2/+9032     S/L6/R2/-11067
L7        10928-12962  S/L7/F1/+10883    S/L7/R1/-13050    S/L7/F2/+10928    S/L7/R2/-12963
L8                     S/L8/F1/+12690    S/L8/R1/-14857    S/L8/F2/+12815    S/L8/R2/-14835
L9                                       S/L9/R1/-16678    S/L9/F2/+14745    S/L9/R2/-16625
L10                    S/L10/F1/+16451   S/L10/R1/-18594   S/L10/F2/+16514   S/L10/R2/-18571
L11                                      S/L11/R1/-20612   S/L11/F2/+18500   S/L11/R2/-20583
L12                    S/L12/F1/+20279                     S/L12/F2/+20319   S/L12/R2/-22206






[0425] All the fragments were amplified under the follow- 
ing conditions, except fragment L0 which was amplified as 
described above for ORF-M: 

[0426] RT-PCR: 30 min at 42° C, 15 min at 55° C, 2 
min at 94° C, and then the cDNA obtained is amplified 
under the following conditions: 40 cycles comprising: 
a step of denaturation at 94° C. for 15 sec, a step of 
annealing at 58° C. for 30 sec and then a step of 
extension at 68° C. for 1 min 30 sec, with 5 sec 
additional extension at each cycle, and then a final step 
of extension at 68° C. for 7 min. 

[0427] Nested PCR: An initial step of denaturation at 
94° C. for 2 min is followed by 35 cycles comprising: 
a step of denaturation at 94° C. for 15 sec, a step of 
annealing at 60° C. for 30 sec and then a step of 
extension at 72° C. for 1 min 30 sec, with 5 sec of 
additional extension at each cycle, and then a final step 
of extension at 72° C. for 7 min. 

[0428] The amplification products were sequenced with 
the aid of the primers defined in table III below: 



TABLE III
Primers used for the sequencing of the region (ORF1a and ORF1b)

S/L6/+/10677
S/L6/+/10106
S/L7/+/12088
S/L7/+/11551

5'-CCGGCATCCAAACATAATTT-3'
5'-TGGTCAGTAGGGTTGATTGG-3'
5'-CATCCTTTGTGTCAACATCG-3'
5'-ATGCGACGAGTCTGCTTCTA-3'
5'-ATCTTGGCGCATGTATTGAC-3'
5'-CCTTGTGGCAATGAAGTACA-3'
5'-CTTCAATGGTTTGCCATGTT-3'
5'-TGCGAGCTGTCATGAGAATA-3'
5'-AACCGAGAGCAGTACCACAG-3'
5'-GAGCAGGCTGTAGCTAATGG-3'
5'-TTAGGCTATTGTTGCTGCTG-3'

S/L8/-13160
S/L8/-/13704
S/L8/-14284
S/L8/+/14453
S/L8/+/13968
S/L8/+/13401
S/L9/-15098
S/L9/+15858
S/L9/+15288
S/L10/-16914
S/L10/-18022
S/L10/+18245
S/L10/+17663
S/L10/+17061
S/L11/-20002
S/L11/+20245
S/L11/+/19611
SARS/L1/F3/+800
SARS/L1/F4/+1391
SARS/L1/F5/+1925
SARS/L1/R3/-1674
SARS/L1/R4/-1107
SARS/L1/R5/-520
SARS/L2/P5/+3746
SARS/L2/R3/-3579

5'-CGCTGACGTGATATATGTGG-3'
5'-GGCATTGTAGGCGTACTGAC-3'
5'-GTTTGCGGTGTAAGTGCAG-3'
5'-CCTTACCCAGATCCATCAAG-3'
5'-CGCAAACATAACACTTGCTG-3'
5'-AGTGTTGGGTACAAGCCAGT-3'
5'-GTTCCAAGGAACATGTCTGG-3'
5'-AGGTGCCTGTGTAGGATGAA-3'
5'-GCAAGCAGAATTAACCCTCA-3'
5'-TGGTCCCTTTGAAGGTGTTA-3'
5'-TCGAACACATCGTTTATGGA-3'
5'-GAGGTGCAGTCACTCGCTAT-3'
5'-CAGAGATTGGACCTGAGCAT-3'
5'-CACGTGGTTGAATGACTTTG-3'
5'-ATTTCTGCAACCAGCTCAAC-3'
5'-TTTCTTCACCAGCATCATCA-3'





standard conditions, with the aid of the DNA polymerase 
Platinum Pfx® (INVITROGEN). The plasmids SRAS-N 
and SRAS-S were used as template and the following 
oligo-nucleotides as primers: 




[0429] The sequences of the fragments LO to LI 2 of the 
SARS-CoV strain derived from the sample recorded under 
the No. 031589 correspond respectively to the sequences 
SEQ ID NO: 41 to SEQ ID NO: 54 in the sequence listing 
appended as an annex. Among these sequences, only that 
corresponding to the fragments L5 contains a nucleotide 
difference in relation to the corresponding sequence of the 
isolate AY278741-Urbani. This t/c mutation at position 7919 
results in a modification of the amino acid sequence of the 
corresponding protein, encoded by ORFla: at position 2552, 
a valine (gtt codon; AY278741) is changed to alanine (get 
codon) in the SARS-CoV strain 031589. By contrast, no 
mutation was identified in relation to the corresponding 
sequence of the isolate AY2741 19.3-Urbani. The other frag- 
ments do not exhibit differences in relation to the corre- 
.g sequences of the isolates Tor2 and Urbani. 



EXAMPLE 2 

Production and Purification of the Recombinant N 
and S Proteins of the SARS-CoV Strain Derived 
from the Sample Recorded Under the Number 
031589 

[0434] The entire N protein and two polypeptide frag- 
ments of the S protein of the SARS-CoV strain derived from 
the sample recorded under the number 031589 were pro- 
duced in E. coli, in the form of fusion proteins comprising 
an N- or C-terminal polyhistidine tag. In the two S polypep- 
tides, the N- and C-terminal hydrophobic sequences of the 
S protein (signal peptide: positions 1 to 13 and transmem- 
brane helix: positions 1196 to 1218) were deleted whereas 
the β helix (positions 565 to 687) and the two motifs of the
coiled-coil type (positions 895 to 980 and 1155 to 1186) of
the S protein were preserved. These two polypeptides
consist of: a long fragment (S_L) corresponding to positions 14
to 1193 of the amino acid sequence of the S protein and a
short fragment (S_C) corresponding to positions 475 to 1193
of the amino acid sequence of the S protein.
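
The coordinates given in this paragraph can be cross-checked with a minimal Python sketch, offered as an illustration rather than as part of the method. All positions are taken from the paragraph; the helper names are hypothetical.

```python
# Coordinates (1-based, inclusive) taken from the paragraph above.
FRAGMENTS = {"long fragment": (14, 1193), "short fragment": (475, 1193)}
DELETED = {"signal peptide": (1, 13), "transmembrane helix": (1196, 1218)}
PRESERVED = {
    "beta helix": (565, 687),
    "coiled coil 1": (895, 980),
    "coiled coil 2": (1155, 1186),
}


def length(span):
    """Number of residues in an inclusive span."""
    start, end = span
    return end - start + 1


def contains(outer, inner):
    """True when the inner span lies entirely within the outer span."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a, b):
    """True when two inclusive spans share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


for name, span in FRAGMENTS.items():
    kept = all(contains(span, region) for region in PRESERVED.values())
    clean = not any(overlaps(span, region) for region in DELETED.values())
    print(name, length(span), "residues; motifs preserved:", kept,
          "; hydrophobic ends removed:", clean)
```

The long fragment spans 1180 residues and the short fragment 719 residues; both retain the three listed motifs and exclude the two hydrophobic regions.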
1) Cloning of the cDNAs N, S_L and S_C into the Expression
Vectors pIVEX2.3 and pIVEX2.4

[0435] The cDNAs corresponding to the N protein and to
the S_L and S_C fragments were amplified by PCR under
standard conditions, with the aid of the DNA polymerase
Platinum Pfx® (INVITROGEN). The plasmids SRAS-N
and SRAS-S were used as templates and the following
oligonucleotides as primers:



5'-CCCATATGTCTGATAATGGACCCCAATCAAAC-3'
(N sense, SEQ ID NO: 55)

5'-CCCCCGGGTGCCTGAGTTGAATCAGCAGAAGC-3'
(N antisense, SEQ ID NO: 56)

5'-CCCATATGAGTGACCTTGACCGGTGCACCAC-3'
(S_C sense, SEQ ID NO: 57)

5'-CCCATATGAAACCTTGCACCCCACCTGCTC-3'

(S_C and S_L antisense, SEQ ID NO: 29)

[0436] The sense primers introduce an NdeI site (underlined)
while the antisense primers introduce an XmaI or
SmaI site (underlined). The 3 amplification products were
column purified (QIAquick PCR Purification kit, QIAGEN)
and cloned into an appropriate vector. The plasmid DNA
purified from the 3 constructs (QIAFilter Midi Plasmid kit,
QIAGEN) was verified by sequencing and digested with the
enzymes NdeI and XmaI. The 3 fragments corresponding to
the cDNAs N, S_L and S_C were purified on agarose gel and
then inserted into the plasmids pIVEX2.3MCS (C-terminal
polyhistidine tag) and pIVEX2.4d (N-terminal polyhistidine
tag) digested beforehand with the same enzymes. After
verification of the constructs, the 6 expression vectors thus
obtained (pIV2.3N, pIV2.3S_C, pIV2.3S_L, pIV2.4N,
pIV2.4S_C also called pIV2.4S_1, and pIV2.4S_L) were then used,
on the one hand to test the expression of the proteins in vitro,
and on the other hand to transform the bacterial strain
BL21(DE3)pDIA17 (NOVAGEN). These constructs encode
proteins whose expected molecular masses are as follows:
pIV2.3N (47174 Da), pIV2.3S_C (82897 Da), pIV2.3S_L
(132056 Da), pIV2.4N (48996 Da), pIV2.4S_1 (81076 Da)
and pIV2.4S_L (133877 Da). Bacteria transformed with
pIV2.3N were deposited at the CNCM on Oct. 23, 2003,
under the number I-3117, and bacteria transformed with
pIV2.4S_1 were deposited at the CNCM on Oct. 23, 2003,
under the number I-3118.
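
As an illustration of the cloning design, the restriction sites carried by the labelled primers can be located with the following Python sketch. The recognition sequences used are CATATG for NdeI and CCCGGG for XmaI and SmaI (the two enzymes share the same recognition sequence); the dictionary layout and the function name are hypothetical.

```python
# Recognition sequences: NdeI = CATATG; XmaI and SmaI both recognize CCCGGG.
SITES = {"NdeI": "CATATG", "XmaI/SmaI": "CCCGGG"}

# Labelled primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "SEQ ID NO: 55 (N sense)": "CCCATATGTCTGATAATGGACCCCAATCAAAC",
    "SEQ ID NO: 56 (N antisense)": "CCCCCGGGTGCCTGAGTTGAATCAGCAGAAGC",
    "SEQ ID NO: 57": "CCCATATGAGTGACCTTGACCGGTGCACCAC",
}


def find_sites(primer):
    """Map each enzyme name to the 0-based offsets where its site occurs."""
    hits = {}
    for enzyme, site in SITES.items():
        offsets = [i for i in range(len(primer) - len(site) + 1)
                   if primer[i:i + len(site)] == site]
        if offsets:
            hits[enzyme] = offsets
    return hits


for label, sequence in PRIMERS.items():
    print(label, find_sites(sequence))
```

In each of these primers the site sits immediately after a two-nucleotide 5' extension, and the sense primers carry only the NdeI site while the antisense primer carries only the XmaI/SmaI site.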

2) Analysis of the Expression of the Recombinant Proteins 
In Vitro and In Vivo 

[0437] The expression of recombinant proteins from the 6
recombinant vectors was first tested in an in vitro
system (RTS100, Roche). The proteins produced in
vitro, after incubation of the recombinant vectors pIVEX for
4 h at 30° C. in the RTS100 system, were analyzed by
Western blotting with the aid of an anti-(His)6 antibody
coupled to peroxidase. The result of expression in vitro
(FIG. 1) shows that only the N protein is expressed in large
quantities, regardless of the position, N- or C-terminal, of
the polyhistidine tag. In a second step, the expression of the
N and S proteins was tested in vivo at 30° C. in LB medium
in the presence or in the absence of inducer (1 mM IPTG).
The N protein is very well produced in this bacterial system
(FIG. 2) and is found mainly in the soluble fraction after lysis
of the bacteria. By contrast, the long version of S (S_L) is very
weakly produced and is completely insoluble (FIG. 3). The
short version (S_C) also exhibits a very weak solubility, but
an expression level that is much higher than that of the long
version. Moreover, the construct S_C fused with a
polyhistidine tag at the C-terminal position has a smaller size than
that expected. An immunodetection experiment with an
anti-polyhistidine antibody has shown that this construct
was incomplete. In conclusion, the two constructs, pIV2.3N
and pIV2.4S_1, which express respectively the entire N
protein fused with the C-terminal polyhistidine tag and the
short S protein fused with the N-terminal polyhistidine tag,
were selected in order to produce the two proteins in a large
quantity so as to purify them. The plasmids pIV2.3N and
pIV2.4S_1 were deposited respectively under the No. I-3117
and I-3118 at the CNCM, 25 rue du Docteur Roux, 75724
PARIS 15, on Oct. 23, 2003.

3) Analysis of the Antigenic Activity of the Recombinant 
Proteins 

[0438] The antigenic activity of the N, S_L and S_C proteins
was tested by Western blotting with the aid of two serum 
samples, obtained from the same patient infected with 
SARS-CoV, collected 8 days (M12) and 29 days (M13) after 
the onset of the SARS symptoms. The experimental protocol 
is as described in example 3. The results illustrated by FIG. 
4 show (i) the seroconversion of the patient, and (ii) that the 
N protein possesses a higher antigenic reactivity than the 
short S protein.

4) Purification of the N Protein from pIV2.3N

[0439] Several experiments for purifying the N protein,
produced from the vector pIV2.3N, were carried out
according to the following protocol. The bacteria
BL21(DE3)pDIA17, transformed with the expression vector
pIV2.3N, were cultured at 30° C. in 1 liter of culture
medium containing 0.1 mg/ml of ampicillin, and induced
with 1 mM IPTG when the cell density equivalent to
A600 = 0.8 was reached (about 3 hours). After 2 hours of culture
in the presence of inducer, the cells were recovered by
centrifugation (10 min at 5000 rpm), resuspended in the lysis
buffer (50 mM NaH2PO4, 0.3 M NaCl, 20 mM imidazole,
pH 8, containing the mixture of protease inhibitors
Complete®, Roche), and lysed with the French press (12 000
psi). After centrifugation of the bacterial lysate (15 min at
12 000 rpm), the supernatant (50 ml) was loaded at a flow
rate of 1 ml/min onto a metal chelation column (15 ml)
(Ni-NTA superflow, Qiagen), equilibrated with the lysis
buffer. After washing the column with 200 ml of lysis buffer,
the N protein was eluted with an imidazole gradient
(20→250 mM) in 10 column volumes. The fractions containing
the N protein were pooled and analyzed by
polyacrylamide gel electrophoresis under denaturing conditions
followed by staining with Coomassie blue. The results
illustrated by FIG. 5 show that the protocol used makes it
possible to purify the N protein with a very satisfactory
homogeneity (95%) and a mean yield of 15 mg of protein
per liter of culture.
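
The elution step can be illustrated with a short Python sketch of the gradient arithmetic. The paragraph specifies the end points (20 and 250 mM imidazole), the column volume (15 ml) and the gradient length (10 column volumes); the linear shape of the gradient is an assumption made here for illustration, and the function name is hypothetical.

```python
COLUMN_VOLUME_ML = 15.0
GRADIENT_COLUMN_VOLUMES = 10
START_MM, END_MM = 20.0, 250.0


def imidazole_mm(eluted_ml):
    """Imidazole concentration after eluted_ml of gradient (linear shape assumed)."""
    total_ml = COLUMN_VOLUME_ML * GRADIENT_COLUMN_VOLUMES
    fraction = min(max(eluted_ml / total_ml, 0.0), 1.0)
    return START_MM + (END_MM - START_MM) * fraction


for volume_ml in (0.0, 75.0, 150.0):
    print(volume_ml, "ml ->", imidazole_mm(volume_ml), "mM imidazole")

# The 200 ml wash expressed in column volumes.
print("wash:", round(200.0 / COLUMN_VOLUME_ML, 1), "column volumes")
```

With these figures the gradient occupies 150 ml, and its midpoint (75 ml) corresponds to 135 mM imidazole.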

5) Purification of the S_C Protein from pIV2.4S_C (pIV2.4S_1)

[0440] The protocol followed for purifying the short S
protein is very different from that described above because
the protein is highly aggregated in the bacterial system
(inclusion bodies). The bacteria BL21(DE3)pDIA17,
transformed with the expression vector pIV2.4S_1, were cultured
at 30° C. in 1 liter of culture medium containing 0.1 mg/ml
of ampicillin, and induced with 1 mM IPTG when the cell
density equivalent to A600 = 0.8 was reached (about 3 hours).
After 2 hours of culture in the presence of inducer, the cells
were recovered by centrifugation (10 min at 5000 rpm),
resuspended in the lysis buffer (0.1 M Tris-HCl, 1 mM
EDTA, pH 7.5), and lysed with the French press (1200 psi).
After centrifugation of the bacterial lysate (15 min at 12 000
rpm), the pellet was resuspended in 25 ml of lysis buffer
containing 2% Triton X100 and 10 mM β-mercaptoethanol,
and then centrifuged for 20 min at 12 000 rpm. The pellet
was resuspended in 10 mM Tris-HCl buffer containing 7 M
urea, and gently stirred for 30 min at room temperature. This
final washing of the inclusion bodies with 7 M urea is
necessary in order to remove most of the E. coli membrane
proteins which co-sediment with the aggregated S_C protein.
After a final centrifugation for 20 min at 12 000 rpm, the
final pellet was resuspended in the 10 mM Tris-HCl buffer. The
electrophoretic analysis of this preparation (FIG. 6) shows
that the short S protein may be purified with a satisfactory
homogeneity (about 90%) from the inclusion bodies
(insoluble extract).

EXAMPLE 3 
Immunodominance of the N Protein 

[0441] The reactivity of the antibodies present in the
serum of patients suffering from atypical pneumopathy
caused by the SARS-associated coronavirus (SARS-CoV),
toward the various proteins of this virus, was analyzed by
Western blotting under the conditions described below.

1) Materials

a) Lysate of Cells Infected with SARS-CoV 

[0442] Vero E6 cells (2×10⁶) were infected with
SARS-CoV (isolate recorded under the number FFM/MA104) at a
multiplicity of infection (M.O.I.) of 10⁻¹ or 10⁻² and then
incubated in DMEM medium containing 2% FCS, at 35° C.
in an atmosphere containing 5% CO2. 48 hours later, the
cell monolayer was washed with PBS and then lysed with 500
µl of loading buffer prepared according to Laemmli and
containing β-mercaptoethanol. The samples were then
boiled for 10 minutes and then sonicated 3 times for 20
seconds.
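
The infection conditions can be illustrated with a short Python sketch. The cell number and the two M.O.I. values are those of the paragraph; the Poisson estimate of the fraction of cells receiving at least one infectious particle is a standard approximation added here as an assumption, and the function names are hypothetical.

```python
import math

CELLS = 2e6                  # Vero E6 cells, from the paragraph
MOI_VALUES = (1e-1, 1e-2)    # multiplicities of infection, from the paragraph


def infectious_units(cells, moi):
    """Number of infectious units needed for a given multiplicity of infection."""
    return cells * moi


def fraction_infected(moi):
    """Expected fraction of cells hit at least once (Poisson assumption)."""
    return 1.0 - math.exp(-moi)


for moi in MOI_VALUES:
    print(moi, round(infectious_units(CELLS, moi)), round(fraction_infected(moi), 4))
```

At these low multiplicities, only a minority of the cells is expected to be infected in the first round, which is consistent with the 48-hour incubation before lysis.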

b) Antibodies 

b1) Serum from a Patient Suffering from Atypical Pneumopathy

[0443] The serum referenced at the
National Reference Center for Influenza Viruses (Northern
region) under the No. 20033168 is from a French patient
suffering from atypical pneumopathy caused by SARS-CoV
and was collected on day 38 after the onset of the symptoms; the
diagnosis of SARS-CoV infection was made by nested
RT-PCR and quantitative PCR.

b2) Monospecific Rabbit Polyclonal Sera Directed Against
the N Protein or the S Protein

[0444] The sera are those produced from the recombinant
N and S_C proteins (example 2), according to the
immunization protocol described in example 4; they are the rabbit
P13097 serum (anti-N serum) and the rabbit P11135 serum
(anti-S serum).

2) Method 

[0445] 20 µl of lysate of cells infected with SARS-CoV at
M.O.I. values of 10⁻¹ and 10⁻² and, as a control, 20 µl of a
lysate of noninfected cells (mock) were separated on 10%
SDS polyacrylamide gel and then transferred onto a
nitrocellulose membrane. After blocking in a solution of PBS/5%
milk/0.1% Tween and washing in PBS/0.1% Tween, this
membrane was incubated overnight at 4° C. with: (i) the
immune serum No. 20033168 diluted 1/500, 1/1000 and 1/3000 in
the buffer PBS/1% BSA/0.1% Tween, (ii) the rabbit P13097
serum (anti-N serum) diluted 1/50 000 in the same buffer and
(iii) the rabbit P11135 serum (anti-S serum) diluted 1/10 000
in the same buffer. After washing in PBS/Tween, a
secondary incubation was performed with the aid of either sheep
polyclonal antibodies directed against the heavy and light
chains of human G immunoglobulins and coupled with
peroxidase (NA933V, Amersham), or donkey polyclonal
antibodies directed against the heavy and light chains of
rabbit G immunoglobulins and coupled with peroxidase
(NA934V, Amersham). The bound antibodies were
visualized with the aid of the ECL+ kit (Amersham) and of
Hyperfilm MP autoradiography films (Amersham). A
molecular mass ladder (kDa) is presented in the figure.

3) Results 

[0446] FIG. 7 shows that three polypeptides of apparent 
molecular mass 35, 55 and 200 kDa are specifically detected 
in the extracts of cells infected with SARS-CoV.

[0447] In order to identify these polypeptides, two other
immunoblots (FIG. 8) were prepared on the same samples
and under the same conditions with rabbit polyclonal
antibodies specific for the nucleoprotein N (rabbit P13097, FIG.
8A) and for the spike protein S (rabbit P11135, FIG. 8B).
This experiment shows that the 200 kDa polypeptide
corresponds to the SARS-CoV spike glycoprotein S, that the
55 kDa polypeptide corresponds to the nucleoprotein N
while the 35 kDa polypeptide probably represents a
truncated or degraded form of N.

[0448] The data presented in FIG. 7 therefore show that 
the serum 20033168 reacts strongly with N and much more
weakly with the SARS-CoV S, since the 35 and 55 kDa
polypeptides are visualized in the form of intense bands for
1/500, 1/1000 and 1/3000 dilutions of the immune serum whereas
the 200 kDa polypeptide is only weakly visualized for a 
dilution of Vioo. It is also possible to note that no other 
SARS-CoV polypeptide is detected for dilutions greater than 
1/500 of the serum 20033168.

[0449] This experiment indicates that the antibody 
response specific for the SARS-CoV N dominates the anti- 
body responses specific for the other SARS-CoV polypep- 
tides and in particular the antibody response directed against 
the S glycoprotein. It indicates an immunodominance of the
nucleoprotein N during human infections with SARS-CoV. 

EXAMPLE 4 

Preparation of Monospecific Polyclonal Antibodies
Directed Against the SARS-Associated Coronavirus 
(SARS-CoV) N and S Proteins 

1) Materials and Method 

[0450] Three rabbits (P13097, P13081, P13031) were 
immunized with the purified recombinant polypeptide cor- 
responding to the entire nucleoprotein (N), prepared accord- 
ing to the protocol described in example 2. After a first 
injection of 0.35 mg per rabbit of protein emulsified in 
complete Freund's adjuvant (intradermal route), the animals 
received 3 booster injections at 3 and then 4 weeks' interval, 
of 0.35 mg of recombinant protein emulsified in incomplete 
Freund's adjuvant. 

[0451] Three rabbits (P11135, P13042, P14001) were
immunized with the recombinant polypeptide corresponding
to the short fragment of the S protein (S_C) produced as
described in example 2. As this polypeptide is found mainly 
in the form of inclusion bodies in the bacterial cytoplasm, 
the animals received 4 intradermal injections at 3-4 weeks' 
interval of a preparation of inclusion bodies corresponding 
to 0.5 mg of recombinant protein emulsified in incomplete 
Freund's adjuvant. The first 3 injections were made with a 
preparation of inclusion bodies prepared according to the 
protocol described in example 2, while the fourth injection 
was made with a preparation of inclusion bodies which were 
prepared according to the protocol described in example 2 
and then purified on sucrose gradient and washed in 2% 
Triton X100. 

[0452] For each rabbit, a preimmune (p.i.) serum was 
prepared before the first immunization and an immune 
serum (I.S.) 5 weeks after the fourth immunization. 
[0453] First, the reactivity of the sera was
analyzed by ELISA test on preparations of recombinant
proteins similar to those used for the immunizations; the
ELISA tests were carried out according to the protocol and
with the reagents as described in example 6.

[0454] Second, the reactivity of the sera was
analyzed by preparing an immunoblot (Western blot) of a 
lysate of cells infected with SARS-CoV, according to the 
protocol as described in example 3. 

2) Results 

[0455] The ELISA tests (FIG. 9) demonstrate that the 
preparations of recombinant N protein and of inclusion 
bodies of the short fragment of the S protein (S_C) are
immunogenic in animals and that the titer of the immune 
sera is high (more than Vis ooo). 

[0456] The immunoblot (FIG. 8) shows that the rabbit 
P13097 immune serum recognizes two polypeptides present 
in the lysates of cells infected with SARS-CoV: a polypeptide
whose apparent molecular mass (50-55 kDa depending on
the experiment) is compatible with that of the nucleoprotein N
(422 residues, predicted molecular mass of 46 kDa) and a
polypeptide of 35 kDa, which probably represents a trun- 
cated or degraded form of N. 

[0457] This experiment also shows that the rabbit P11135
serum mainly recognizes a polypeptide whose apparent
molecular mass (180-220 kDa depending on the experiment) is
compatible with a glycosylated form of S (1255 residues, 
nonglycosylated polypeptide chain of 139 kDa), as well as 
lighter polypeptides, which probably represent truncated 
and/or nonglycosylated forms of S. 
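
The predicted masses quoted for the naked polypeptide chains can be approximated from the residue counts, as in the following Python sketch. The mean residue mass of 110 Da is a rule-of-thumb assumption that does not come from the text; the residue counts and the predicted and apparent masses are those quoted above.

```python
AVERAGE_RESIDUE_DA = 110.0  # assumption: rule-of-thumb mean mass of one residue

# name: (residues, predicted kDa quoted in the text, apparent kDa range quoted)
QUOTED = {
    "N": (422, 46.0, (50, 55)),
    "S": (1255, 139.0, (180, 220)),
}


def estimated_kda(n_residues):
    """Rough mass of an unmodified polypeptide chain, in kDa."""
    return n_residues * AVERAGE_RESIDUE_DA / 1000.0


for name, (residues, predicted_kda, apparent_kda) in QUOTED.items():
    estimate = estimated_kda(residues)
    excess = apparent_kda[0] - predicted_kda
    print(f"{name}: estimate {estimate:.1f} kDa, quoted {predicted_kda} kDa, "
          f"apparent mass exceeds the quoted value by at least {excess:.0f} kDa")
```

The estimates fall close to the quoted values for both chains, and the much larger gap between predicted and apparent mass for S than for N is in keeping with the interpretation that the main S band is a glycosylated form.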

[0458] In conclusion, all these experiments demonstrate 
that the recombinant polypeptides expressed in E. coli and 
corresponding to the SARS-CoV N and S proteins make it 
possible to induce, in animals, polyclonal antibodies capable 
of recognizing the native forms of these proteins. 

EXAMPLE 5 

Preparation of Monospecific Polyclonal Antibodies
Directed Against the SARS-Associated Coronavirus 
(SARS-CoV) M and E Proteins 

1) Analysis of the Structure of the M and E Proteins 

a) E Protein 

[0459] The structure of the SARS-CoV E protein (76
amino acids) was analyzed in silico, with the aid of various
software packages such as SignalP v1.1, NetNGlyc 1.0,
TMHMM 1.0 and 2.0 (Krogh et al., 2001, J. Mol. Biol.,
305(3):567-580) or alternatively TOPPRED (von Heijne,
1992, J. Mol. Biol. 225, 487-494). The analysis shows that
this nonglycosylated polypeptide is a type 1 membrane
protein, containing a single transmembrane helix (aa 12-34
according to TMHMM), and in which the majority of the
hydrophilic domain (42 residues) is located at the C-terminal
end and probably inside the viral particle (endodomain).
The topology predicted by versions 1.0 (N-ter is external)
and 2.0 (N-ter is internal) of the TMHMM software is
inverted, but other algorithms, in particular
TOPPRED and THUMBUP (Zhou and Zhou, 2003, Protein
Science 12:1547-1555), confirm an external location of the
N-terminal end of E.

b) M Protein

[0460] A similar analysis carried out on the SARS-CoV M
protein (221 amino acids) shows that this polypeptide does
not possess a signal peptide (according to the software
SignalP v1.1) but has three transmembrane domains (residues
15-37, 50-72, 77-99 according to TMHMM 2.0) and a large
hydrophilic domain (aa 100-221) located inside the viral
particle (endodomain). It is probably glycosylated on the
asparagine at position 4 (according to NetNGlyc 1.0).
[0461] Thus, in agreement with the experimental data
known for the other coronaviruses, it is notable that
both the M and E proteins exhibit endodomains corresponding to
the majority of the polypeptide and ectodomains that
are very small in size.

[0462] The ectodomain of E probably corresponds to
residues 1 to 11 or 1 to 12 of the protein:
MYSFVSEETGT(L), SEQ ID NO: 70. Indeed, the probability
associated with the transmembrane location of residue
12 is intermediate (0.56 according to TMHMM 2.0).

[0463] The ectodomain of M probably corresponds to 
residues 2 to 14 of the protein: ADNGTITVEELKQ, 
SEQ ID NO: 69. Indeed, the N-terminal methionine of 
M is very probably cleaved from the mature polypep- 
tide because the residue at position 2 is an alanine 
(Varshavsky, 1996, 93:12142-12149). 

[0464] Moreover, the analysis of the hydrophobicity (Kyte 
& Doolittle, Hopp & Woods) of the E protein demonstrates 
that the C-terminal end of the endodomain of E is hydro- 
philic and therefore probably exposed at the surface of this 
domain. Thus, a synthetic peptide corresponding to this end 
is a good immunogenic candidate for inducing, in animals, 
antibodies directed against the endodomain of E. Conse- 
quently, a peptide corresponding to 24 C-terminal residues 
of E was synthesized. 
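
A hydrophobicity analysis of the Kyte & Doolittle type can be sketched in Python as follows, purely as an illustration and not as the software actually used. The scale values are the published Kyte & Doolittle hydropathy indices; the peptide is the 24-residue C-terminal sequence of E listed in this document as SEQ ID NO: 71; the window length of 7 and the function names are arbitrary choices made here.

```python
# Kyte & Doolittle hydropathy indices (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

E53_76 = "KPTVYVYSRVKNLNSSEGVPDLLV"  # residues 53 to 76 of E (SEQ ID NO: 71)


def mean_hydropathy(sequence):
    """Average Kyte & Doolittle index over a sequence."""
    return sum(KD[residue] for residue in sequence) / len(sequence)


def profile(sequence, window=7):
    """Sliding-window hydropathy profile; one value per window start."""
    return [mean_hydropathy(sequence[i:i + window])
            for i in range(len(sequence) - window + 1)]


print("length:", len(E53_76))
print("mean hydropathy:", round(mean_hydropathy(E53_76), 3))
print("profile:", [round(value, 2) for value in profile(E53_76)])
```

Negative values indicate hydrophilic stretches; the profile makes it possible to see which windows of the peptide contribute most to its overall hydrophilic character.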

2) Preparation of Antibodies Directed Against the 
Ectodomain of the M and E Proteins and the Endodomain of 
the E Protein 

[0465] The peptides M2-14 (ADNGTITVEELKQ, SEQ
ID NO: 69), E1-12 (MYSFVSEETGTL, SEQ ID NO: 70)
and E53-76 (KPTVYVYSRV KNLNSSEGVP DLLV, SEQ
ID NO: 71) were synthesized by Neosystem. They were
coupled with KLH (Keyhole Limpet Hemocyanin) with the
aid of MBS (m-maleimido-benzoyl-N-hydroxysuccinimide
ester) via a cysteine added during the synthesis either at the
N-terminal end of the peptide (case of E53-76) or at the
C-terminal end (case of M2-14 and E1-12).
[0466] Two rabbits were immunized with each of the 
conjugates, according to the following immunization proto- 
col: after a first injection of 0.5 mg of peptide coupled with 
KLH and emulsified in complete Freund's adjuvant (intradermal
route), the animals received 2 to 4 booster injections
at 3 or 4 weeks' interval of 0.25 mg of peptide coupled to 
KLH and emulsified in incomplete Freund's adjuvant. 
[0467] For each rabbit, a preimmune (p.i.) serum was 
prepared before the first immunization and an immune 
serum (I.S.) was prepared 3 to 5 weeks after the booster
injections. 

[0468] The reactivity of the sera was analyzed by Western
blotting with the aid of extracts of cells infected with
SARS-CoV (FIG. 43B) or with the aid of extracts of cells
infected with a recombinant vaccinia virus expressing the
protein E (VV-TG-E, FIG. 43A) or M (VV-TN-M, FIG.
43C) of the SARS-CoV 031589 isolate.

[0469] The immune sera of the rabbits 22234 and 22240, 
immunized with the conjugate KLH-E53-76, recognize a 
polypeptide of about 9 to 10 kD, which is present in the 
extracts of cells infected with SARS-CoV but absent from 
the extracts of noninfected cells (FIG. 43B). The apparent 
mass of this polypeptide is compatible with the predicted 
mass of the E protein, which is 8.4 kD. Similarly, the 
immune serum of the rabbit 20047, immunized with the 
conjugate KLH-E1-12, recognizes a polypeptide present in 
the extracts of cells infected with the VV-TG-E virus, whose
apparent molecular mass is compatible with that of the E protein
(FIG. 43A). 

[0470] The immune sera of the rabbits 20013 and
20080, immunized with the conjugate KLH-M2-14,
recognize a polypeptide present in the extracts of cells infected
with the VV-TN-M virus (FIG. 43C), whose apparent molecular
mass (about 18 kD) is compatible with that of the
glycoprotein M, which is 25.1 kD and has a high isoelectric point
(9.1 for the naked polypeptide).

[0471] These results demonstrate that the peptides E1-12
and E53-76, on the one hand, and the peptide M2-14, on the 
other hand, make it possible to induce, in animals, poly- 
clonal antibodies capable of recognizing the native forms of 
the SARS-CoV E and M proteins, respectively. 

EXAMPLE 6 

Analysis of the ELISA Reactivity of the 
Recombinant N Protein Toward Sera from Patients 
Suffering from SARS 



[0472] The antigen used to prepare the solid phases is the 
purified recombinant nucleoprotein N prepared according to 
the protocol described in example 2. 

[0473] The sera to be tested (table IV) were chosen on the 
basis of the results of analysis of their reactivity by immu- 
nofluorescence (IF-SARS titer), toward cells infected with 
SARS-CoV. 



Reference No. Type of si 



l " His 

032632 
032791 



Patient-1 SARS Apr. 27, 2003 (D38)
Patient-1 SARS May 11, 2003 (D52)
Patient-2 SARS Mar. 21, 2003 (D17)
Patient-3 SARS Apr. 04, 2003 (D3)
Patient-3 SARS Apr. 28, 2003 (D27)



le SARS symptoms.

2) Method

[0474] The N protein (100 µl) diluted at various
concentrations in 0.1 M carbonate buffer, pH 9.6 (1, 2 or 4 µg/ml)
is distributed into the wells of ELISA plates, and then the
plates are incubated overnight at room temperature.
The plates are washed with PBS-Tween buffer and then saturated
with PBS-skimmed milk-sucrose (5%) buffer. The test sera
(100 ul), diluted beforehand (V4o, Vioo, V200, Vaoo, Yam, Vieoo 
and Vnoo) are added and then the plates are incubated for 1 
h at 37° C. After 3 washings, the peroxidase-labeled anti- 
human IgG conjugate (reference 209-035-098, JACKSON) 
diluted Ms 000 is added and then the plates are incubated for 
1 h at 37° C. After 4 washings, the chromogen (TMB) and 
the substrate (H2O2) are added and the plates are incubated
for 30 min at room temperature, protected from light. The 
reaction is then stopped and then the absorbance at 450 nm 
is measured with the aid of an automated reader. 
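
The quantity of antigen deposited per well follows directly from the coating volume and concentrations given above, as the following Python sketch shows. The values are those of the paragraph; the function name is hypothetical.

```python
WELL_VOLUME_UL = 100.0                 # coating volume per well, from the paragraph
COATING_UG_PER_ML = (1.0, 2.0, 4.0)    # N protein concentrations, from the paragraph


def coated_ng(conc_ug_per_ml, volume_ul=WELL_VOLUME_UL):
    """Mass of protein added to one well, in ng (1 µg/ml equals 1 ng/µl)."""
    return conc_ug_per_ml * volume_ul


per_well = {conc: coated_ng(conc) for conc in COATING_UG_PER_ML}
print(per_well)
```

The three coating conditions therefore correspond to 100, 200 and 400 ng of N protein per well.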

3) Results 

[0475] The ELISA tests (FIG. 10) demonstrate that the 
recombinant N protein preparation is specifically recognized 
by the antibodies of sera from patients suffering from SARS 
collected in the late phase of the infection (≥17 days after
the onset of the symptoms) whereas it is not significantly 
recognized by the antibodies of a patient's serum collected 
in the early phase of the infection (3 days after the onset of 
the symptoms) or by control sera from subjects not suffering 
from SARS. 

EXAMPLE 7 

ELISA Tests Prepared for a Very Specific and
Sensitive Detection of a SARS-Associated 
Coronavirus Infection, from Sera of Patients 

1) Indirect ELISA IgG Test 
a) Reagents 

Preparation of the Plates 

[0476] The plates are sensitized with a solution of N
protein at 2 µg/ml in a 10 mM PBS buffer, pH 7.2, phenol
red at 0.25 ml/l. 100 µl of solution are deposited in the wells
and left to incubate at room temperature overnight.
Saturation is obtained by prewashing in 10 mM PBS/0.1% Tween
buffer, followed by washing with a saturation solution PBS,
25% milk/sucrose.

Diluent Sera

[0477] Buffer 0.48 g/l TRIS, 10 mM PBS, 3.7 g/l EDTA,
15% v/v milk, pH 6.7

Diluent Conjugate

[0478] Citrate buffer (15 g/l), 0.5% Tween, 25% bovine
serum, 12% NaCl, 6% v/v skimmed milk, pH 6.5

Conjugate

[0479] 50x anti-human IgG conjugate, marketed by
Bio-Rad: Platelia H. pylori kit, ref 72778

Other Solutions:

[0480] Washing solution R2, solutions for visualizing with
TMB: R8 diluent, R9 chromogen, R10 stopping solution;
reagents marketed by Bio-Rad (e.g.: Platelia H. pylori kit, ref
72778)



b) Procedure 

[0481] Dilute the sera 1/200 in the sample diluent
[0482] Distribute 100 µl/well
[0483] Incubation 1 h at 37° C.

[0484] 3 washings in 10x washing solution R2 diluted
beforehand 10-fold in demineralized water (i.e., 1x washing
solution)

[0485] Distribute 100 µl of conjugate (50x conjugate to be
diluted immediately before use in the diluent conjugate
provided)

[0486] Incubation 1 h at 37° C.

[0487] 4 washings in 1x washing solution

[0488] Distribute 200 µl/well of visualization solution (to
be diluted immediately before use e.g.: 1 ml of R9 in 10 ml
of R8)

[0489] Incubation for 30 min at room temperature in the
dark

[0490] Stop the reaction with 100 µl/well of R10

[0491] READING at 450/620 nm
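
The reagent dilutions called for in this procedure can be worked out as in the following Python sketch. The dilution factors and per-well volumes come from the procedure; the plate size of 96 wells and the 1 liter batch of washing solution are hypothetical figures chosen for illustration, "1 ml of R9 in 10 ml of R8" is read literally as 1 part plus 10 parts, and no dead volume is included.

```python
def split_dilution(final_volume, fold):
    """Return (stock volume, diluent volume) for a fold dilution to final_volume."""
    stock = final_volume / fold
    return stock, final_volume - stock


WELLS = 96  # hypothetical: one full plate

# 10x washing solution R2 diluted 10-fold (hypothetical 1000 ml batch of 1x solution).
wash_stock_ml, water_ml = split_dilution(1000.0, 10)

# 50x conjugate diluted in its diluent, 100 µl per well.
conj_stock_ul, conj_diluent_ul = split_dilution(WELLS * 100.0, 50)

# Visualization solution, 200 µl per well: 1 part R9 plus 10 parts R8.
visualization_ml = WELLS * 0.2
r9_ml = visualization_ml / 11
r8_ml = visualization_ml - r9_ml

print(wash_stock_ml, water_ml)
print(conj_stock_ul, conj_diluent_ul)
print(round(r9_ml, 2), round(r8_ml, 2))
```

In practice a margin would be added to each volume to allow for pipetting losses.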

[0492] The results can be interpreted by taking a 
THRESHOLD serum giving a response above which the 
sera tested would be considered as positive. This serum is 
chosen and diluted so as to give a significantly higher signal 
than the background noise. 
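
The interpretation rule described above can be sketched in Python as follows. The readings are hypothetical values invented for illustration, not data from the tests; subtracting the 620 nm reading from the 450 nm reading is a common dual-wavelength convention that is assumed here, since the text only states that the plates are read at 450/620 nm.

```python
def corrected_od(a450, a620):
    """Signal at 450 nm minus the 620 nm reference reading (assumed convention)."""
    return a450 - a620


def interpret(sample_od, threshold_od):
    """A serum is positive when its response exceeds that of the threshold serum."""
    return "positive" if sample_od > threshold_od else "negative"


# Hypothetical (A450, A620) readings, for illustration only.
threshold = corrected_od(0.350, 0.050)
readings = {"serum A": (1.250, 0.060), "serum B": (0.120, 0.045)}

results = {name: interpret(corrected_od(*pair), threshold)
           for name, pair in readings.items()}
print(results)
```

The threshold serum is run on the same plate as the test sera, so that the comparison is made between readings obtained under identical conditions.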

2) Double Epitope ELISA Test

a) Reagents 

Preparation of the Plates 

[0493] The plates are sensitized with a solution of N
protein at 1 µg/ml in a 10 mM PBS buffer, pH 7.2, phenol red
at 0.25 ml/l. 100 µl of solution are deposited in the wells and
left to incubate at room temperature overnight. Saturation is
obtained by prewashing in 10 mM PBS/0.1% Tween buffer,
followed by washing with a saturation solution 10 mM PBS,
25% (V/V) milk.

Diluent Sera and Conjugate 

[0494] Buffer 50 mM TRIS saline, pH 8, 2% milk 
Conjugate 

[0495] This is the purified recombinant N protein coupled
with peroxidase according to the Nakane protocol (Nakane
P. K. and Kawaoi A.; (1974): Peroxidase-labeled antibody,
a new method of conjugation. The Journal of Histochemistry
and Cytochemistry Vol. 22, No. 23, pp. 1084-1091), in
respective molar ratios 1/2. This ProtN POD conjugate is
used at a concentration of 2 ug/ml in serum/conjugate
diluent.

Other Solutions: 

[0496] Washing solution R2, solutions for visualization 
with TMB R8, diluent, R9 chromogen, R10 stopping solu- 
tion: reagents marketed by Bio-Rad (e.g. Platelia pylori kit, 
ref 72778). 
b) Procedure

[0497] 1st step in "prediction" plate 

[0498] Dilute each serum 1/5 in the prediction plate

[0499] (48 ul of diluent+12 ul of serum). 

[0500] After having diluted all the sera, distribute 60 ul 
of conjugate. 

[0501] Where appropriate, the serum+conjugate mix is 
left to incubate. 

[0502] 2nd step in "reaction" plate 

[0503] Transfer 100 ul of mixture/well into the reaction plate

[0504] Incubation 1 h 37° C. 

[0505] 5 washings in 10x WASHING solution R2
diluted 10-fold beforehand in demineralized water
(-> 1x washing solution)

[0506] Distribute 200 ul/well of visualization solution 
(to be diluted immediately before use e.g.: 1 ml of R9 
in 10 ml of R8) 

[0507] Incubation 30 min at room temperature and 
protected from light 

[0508] Stop the reaction with 100 ul/well of R10

[0509] READING at 450/620 nm 

[0510] As for the indirect ELISA test, the results
can be interpreted using a "threshold value" serum. Any
serum having a response greater than the threshold value
serum will be considered as positive.
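
The dilution arithmetic of the double epitope procedure can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the final serum dilution is derived on the assumption that the 60 ul of conjugate is added to the whole 60 ul of prediluted serum, as the steps above suggest.

```python
from fractions import Fraction

def dilution(sample_ul, diluent_ul):
    """Return the sample fraction of the final volume as an exact fraction."""
    return Fraction(sample_ul) / (Fraction(sample_ul) + Fraction(diluent_ul))

# Step 1: 12 ul of serum in 48 ul of diluent (prediction plate).
serum_in_predilution = dilution(12, 48)

# Step 2: 60 ul of conjugate added to the 60 ul of prediluted serum
# (assumption: the whole predilution receives the conjugate).
serum_in_mixture = dilution(12, 48 + 60)

print(serum_in_predilution, serum_in_mixture)
```

Under these assumptions the serum is at 1/5 in the prediction plate and at 1/10 in the serum+conjugate mixture transferred to the reaction plate.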



2) Results 

[0511] The sera of patients classified as probable cases of 
SARS from the French hospital of Hanoi, Vietnam or in 
relation with the French hospital of Hanoi (JYK) were 
analyzed using the indirect IgG-N test and the double 
epitope N test. 

[0512] The results of the indirect IgG-N test (FIGS. 14 and 
15) and double epitope N test (FIGS. 16 and 17) show an 
excellent correlation between them and with an indirect 
ELISA test comparing the reactivity of the sera toward a 
lysate of VeroE6 cells infected or not infected with SARS- 
CoV (ELISA-SARS-CoV lysate; see table V below). All the 
sera collected 12 days or more after the onset of the 
symptoms were found to be positive, including in patients 
for whom it had not been possible to document the SARS- 
CoV virus infection by analyzing respiratory samples by 
RT-PCR, probably because of a sample being collected too
late during the infection (≥D12). In the case of the patient
TTH for whom a nasal sample collected on D7 was found to 
be negative by RT-PCR, the quality of the sample may be in 
question. 

[0513] Some sera were found to be negative whereas the 
presence of SARS-CoV was detected by RT-PCR. They are 
in all cases early sera collected less than 10 days after the 
onset of the symptoms (e.g.: serum # 032637). In the case of
a patient PTTH (serum # 032673), only a suspicion of SARS 
was raised at the time the samples were collected. 

[0514] In conclusion, the indirect IgG-N and N-double 
epitope serological tests make it possible to document the 
SARS-CoV infection in all the patients for the sera collected 
12 days or more after the infection. 



033168 r 
033597 T 
032552 V 

032544 C 

032546 C 

032548 V 

032550 N 



17 KEG 

D17&D21 
17 NEG-D17&D21 



032554 N 

032555 N 
032564 N 



12 NEG 

D7&D12&D16 
17 NEG 

D17&D21 
TABLE V-continued



Results of the ELISA tests 



Num Patient Day (1 



032648 NNT 

032649 PTH 

032672 LW 

032673 PTTH 

032674 PNB 

032682 VTH 

032683 DTV 



5&D19 
!G 

.7&D21 

3G 

.6&D20 

5G 

5G 

7&D21 

2&D16 
5G 

7&D21 



Remarks: 

[0515] (1): The RT-PCR analyses were carried out by 
nested RT-PCR BNI, LC Artus and LC-N on nasal or 
pharyngeal swabs; POS means that at least one sample was 
found to be positive in this patient. 

[0516] (2): The reactivity of the sera in the ELISA test 
using a lysate of cells infected with SARS-CoV was clas- 
sified as very highly reactive (+++), highly reactive (++), 
reactive (+) and negative according to the OD value obtained 
at the dilutions tested. 

EXAMPLE 8 

Detection of SARS-Associated Coronavirus 
(SARS-CoV) by RT-PCR 

1) Real Time Development of RT-PCR Conditions with the 
Aid of Primers Specific for the Gene for the Nucleocapsid 
Protein— "Light Cycler N" Test 
a) Design of the Primers and Probes 

[0517] The primers and probes were designed from the 
sequence of the genome of the SARS-CoV strain derived 
from the sample recorded under the number 031589, with 
the aid of the programme "Light Cycler Probe Design 
(Roche)". Thus, the following two series of primers and 
probes were selected: 



series 1 (SEQ ID NO: 60, 61, 64, 65):
sense primer: N/+/28507:
5'-GGC ATC GTA TGG GTT G-3' [28507-28522]

5'-GGC ACC CGC AAT CCT AAT AAC AAT GC-fluorescein 3'
[28561-28586]

5' Red705-GCC ACC GTG CTA CAA CTT CCT-phosphate
[28588-28608]

[28702-28687]
probe 1: SARS/N/FL:
5'-ATA CAC CCA AAG ACC ACA TTG GC-fluorescein 3'
[28541-28563]

5' Red705-CCC GCA ATC CTA ATA ACA ATG CTG C-phosphate 3'
[28565-28589]

b) Analysis of the Efficacy of the Two Primer Pairs 

[0518] In order to test the respective efficacy of the two
pairs of primers, an RT-PCR amplification was carried out
on a synthetic RNA corresponding to nucleotides 28054-
29430 of the genome of the SARS-CoV strain derived from
the sample recorded under the number 031589 and containing
the sequence of the N gene.

[0519] More specifically: 

[0520] This synthetic RNA was prepared by in vitro
transcription with the aid of the T7 phage RNA polymerase,
of a DNA template obtained by linearization of the plasmid
SRAS-N with the enzyme BamHI. After eliminating the
DNA template by digestion with the aid of DNAse I, the
synthetic RNAs are purified by a phenol-chloroform extraction,
followed by two successive precipitations in ammonium
acetate and isopropanol. They are then quantified by
measuring the absorbance at 260 nm and their quality is
checked by the ratio of the absorbances at 260 and 280 nm
and by agarose gel electrophoresis. Thus, the concentration
of the synthetic RNA preparation used for these studies is
1.6 mg/ml, which corresponds to 2.1x10^15 copies/ml of
RNA.
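
The conversion from a mass concentration to a copy number can be reproduced as a rough check. This is a minimal sketch, not the inventors' calculation: it assumes an average mass of 340 g/mol per ribonucleotide and a transcript spanning exactly nucleotides 28054-29430 with no additional vector-derived sequence; the function and constant names are illustrative.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
AVG_NT_MASS = 340.0           # g/mol per ribonucleotide (assumed average)

def rna_copies_per_ml(conc_mg_per_ml, length_nt, nt_mass=AVG_NT_MASS):
    """Convert an RNA mass concentration (mg/ml) into copies per ml."""
    molar_mass = length_nt * nt_mass          # g/mol for the whole transcript
    grams_per_ml = conc_mg_per_ml * 1e-3
    return grams_per_ml / molar_mass * AVOGADRO

# Assumption: the transcript covers nucleotides 28054-29430 only.
length = 29430 - 28054 + 1
copies = rna_copies_per_ml(1.6, length)
print(length, f"{copies:.1e} copies/ml")
```

With these assumptions the estimate comes out at about 2x10^15 copies/ml, in line with the value stated above.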

[0521] Decreasing quantities of synthetic RNA were 
amplified by RT-PCR with the aid of the "Superscript™ 
One-Step RT-PCR with Platinum® Taq" kit and the pairs of
primers No. 1 (N/+/28507, N/-/28774) (FIG. 1A) and No. 2
(N/+/28375, N/-/28702) (FIG. 1B), according to the supplier's
instructions. The amplification conditions used are
the following: the cDNA was synthesized by incubation for 
30 min at 45° C., 15 min at 55° C. and then 2 min at 94° C. 
and it was then amplified by 5 cycles comprising: a step of 
denaturation at 94° C. for 15 sec, a step of annealing at 45° 
C. for 30 sec and then a step of extension at 72° C. for 30 
sec, followed by 35 cycles comprising: a step of denatur- 
ation at 94° C. for 15 sec, a step of annealing at 55° C. for 
30 sec and then a step of extension at 72° C. for 30 sec, with 
2 sec of additional extension at each cycle, and a final step 
of extension at 72° C. for 5 min. The amplification products 
obtained were then kept at 10° C. 
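
The cycling program just described can be written down as data and its programmed hold time summed. The sketch below is only an illustration: the names are hypothetical, ramping time is ignored, and the "2 sec of additional extension at each cycle" is interpreted as the first of the 35 cycles keeping a 30 s extension and each later cycle being 2 s longer than the one before.

```python
# (temperature in deg C, hold time in seconds) for reverse transcription
# and initial denaturation.
RT_AND_DENATURATION = [(45, 30 * 60), (55, 15 * 60), (94, 2 * 60)]
LOW_STRINGENCY_CYCLES = 5    # annealing at 45 deg C
MAIN_CYCLES = 35             # annealing at 55 deg C
EXT_INCREMENT = 2            # seconds added to the extension per main cycle

def program_hold_time():
    """Sum the programmed hold times in seconds (ramping excluded)."""
    total = sum(seconds for _, seconds in RT_AND_DENATURATION)
    total += LOW_STRINGENCY_CYCLES * (15 + 30 + 30)
    for cycle in range(MAIN_CYCLES):
        # Assumption: cycle 0 has a 30 s extension, then +2 s per cycle.
        total += 15 + 30 + (30 + EXT_INCREMENT * cycle)
    total += 5 * 60          # final extension at 72 deg C
    return total

total_s = program_hold_time()
print(total_s, "s, about", round(total_s / 60, 1), "min")
```

Under this reading the program holds for roughly two hours before the 10° C. storage step.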

[0522] The results presented in FIG. 11 show that the pair
of primers No. 2 (N/+/28375, N/-/28702) makes it possible
to detect up to 10 copies of RNA (band of weak intensity)
or 10^2 copies (band of good intensity) against 10^4 copies for
the pair of primers No. 1 (N/+/28507, N/-/28774). The
amplicons are respectively 268 bp (pair 1) and 328 bp (pair
2).
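
The stated amplicon sizes follow directly from the primer coordinates. The short check below assumes that the number in each primer name is the genome coordinate of that primer's 5' end, which is consistent with the bracketed positions given for N/+/28507 and for the antisense primer at [28702-28687]; the function name is illustrative.

```python
def amplicon_length(sense_5prime, antisense_5prime):
    """Length in bp of the product bounded by the 5' ends of both primers."""
    return antisense_5prime - sense_5prime + 1

PRIMER_PAIRS = {
    "pair 1": (28507, 28774),   # N/+/28507, N/-/28774
    "pair 2": (28375, 28702),   # N/+/28375, N/-/28702
}

sizes = {name: amplicon_length(*ends) for name, ends in PRIMER_PAIRS.items()}
print(sizes)
```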

c) Development of Real Time RT-PCR 
[0523] A real time RT-PCR was developed with the aid of 
the pair of primers No. 2 and of the pair of probes consisting 
of SRAS/N/FL and SRAS/N/LC705 (FIG. 2). 
[0524] The amplification was carried out on a LightCy- 
cler™ (Roche) with the aid of the "Light Cycler RNA 
Amplification Kit Hybridization Probes" kit (reference 2 
015 145, Roche) under the following optimized conditions. 
A reaction mixture containing: H2O (6.8 ul), 25 mM MgCl2
(0.8 ul, 4 uM Mg2+ final), 5x reaction mixture (4 ul), 3 uM
probe SRAS/N/FL (0.5 ul, 0.075 uM final), 3 uM probe 
SRAS/N/LC705 (0.5 ul, 0.075 uM final), 10 uM primer 
N/+/28375 (1 ul, 0.5 uM final), 10 uM primer N/-/28702 (1 
ul, 0.5 uM final), enzyme mixture (0.4 ul) and sample (viral 
RNA, 5 ul) was amplified according to the following pro- 



[0525] The results presented in FIG. 12 show that this real
time RT-PCR is very sensitive since it makes it possible to
detect 10^2 copies of synthetic RNA in 100% of the 5
samples analyzed (29/29 samples in 8 experiments) and up
to 10 copies of RNA in 100% of the 5 samples analyzed
(40/45 samples in 8 experiments). It also shows that this
RT-PCR makes it possible to detect the presence of the
SARS-CoV genome in a sample and to quantify the number
of genomes present. By way of example, the viral RNA of
a SARS-CoV stock cultured on Vero E6 cells was extracted
with the aid of the "Qiamp viral RNA extraction" kit
(Qiagen), diluted to 0.05xl0~ 14 and analyzed by real time
RT-PCR according to the protocol described above; the
analysis presented in FIG. 12 shows that this virus stock
contains 6.5x10^9 genome-equivalents/ml (geq/ml), which is
entirely similar to the 1.0x10^10 geq/ml value measured with
the aid of the "RealArt™ HPA-Coronavirus LC RT PCR
Reagents" kit marketed by Artus.

2) Development of Nested RT-PCR Conditions Targeting the
Gene for RNA Polymerase: "CDC (Centers for Disease
Control and Prevention)/IP Nested RT-PCR" Test

a) Extraction of the Viral RNA

[0526] Clinical sample: QIAmp viral RNA Mini Kit
(QIAGEN) according to the manufacturer's instructions, or
an equivalent technique. The RNA is eluted in a volume of
60 ul.

b) "SNE/SAR" Nested RT-PCR

First Step: "SNE" Coupled RT-PCR

[0527] The Invitrogen "Superscript™ One-Step RT-PCR
with Platinum® Taq" kit was used, but the "Titan" kit from
Roche Boehringer can be used in its place with similar

Oligonucleotides:
SNE-S1
5' GGT TGG GAT TAT CCA AAA TGT GA 3'
SNE-AS1
5' GCA TCA TCA GAA AGA ATC ATC ATG 3'
-> Expected size: 440 bp

[0528] 1. Prepare a mix:

H2O                      6.5 ul
Reaction mix 2X          12.5 ul
Oligo SNE-S1 50 uM       0.2 ul
Oligo SNE-AS1 50 uM      0.2 ul
RNAsin 40 U/ul           0.12 ul
RT/Platinum Taq mix      0.5 ul



[0529] 2. To 20 ul of the mix, add 5 ul of RNA and carry 
out the amplification on a thermocycler (ABI 9600 condi- 
tions): 
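
The volumes of the "SNE" mix can be checked and scaled for a run with a short helper. This is a sketch under stated assumptions: the per-reaction volumes are those listed in step 1, while the 10% pipetting excess and the function name are illustrative choices that do not come from the text.

```python
# Volumes per reaction in ul, as listed in step 1.
MIX_UL = {
    "H2O": 6.5,
    "Reaction mix 2X": 12.5,
    "Oligo SNE-S1 50 uM": 0.2,
    "Oligo SNE-AS1 50 uM": 0.2,
    "RNAsin 40 U/ul": 0.12,
    "RT/Platinum Taq mix": 0.5,
}
RNA_UL = 5.0

def master_mix(n_reactions, excess=0.1):
    """Scale the per-reaction volumes, with an assumed pipetting excess."""
    factor = n_reactions * (1 + excess)
    return {name: round(vol * factor, 2) for name, vol in MIX_UL.items()}

mix_per_reaction = sum(MIX_UL.values())        # close to the 20 ul of step 2
final_volume = mix_per_reaction + RNA_UL       # close to 25 ul
oligo_final_um = 50 * 0.2 / final_volume       # final concentration of each oligo
print(round(mix_per_reaction, 2), round(oligo_final_um, 2))
print(master_mix(10))
```

The listed components add up to about 20 ul, matching step 2, and each SNE oligonucleotide ends up at roughly 0.4 uM in the 25 ul reaction.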






[0531] Synthetic RNAs served as positive control. As the
control, 10^3, 10^2 and 10 copies of synthetic RNA R SNE were
amplified in each experiment.

[0532] Second Step: "SAR" Nested PCR

Oligonucleotides:
TTG TTC TTG CTC GCA 3'
G CCA CAC ATG 3'
-> Expected size

[0533] 1. Prepare a mix:

H2O
Taq buffer 10X
MgCl2 25 mM
Mix dNTPs 5 mM
Oligo SAR1-S 50 uM
Oligo SAR1-AS 50 uM
Taq DNA pol 5 U/ul

[0534] AmpliTaq DNA Pol from Applied Biosystems was
used (10x buffer without MgCl2, ref 27216601).

[0535] 2. To 48 ul of the mix, add 2 ul of the product from
the first PCR and carry out the amplification (ABI 9600
conditions):

x35 cycles

[0536] 3. Analyze 10 ul of the reaction product on "low-
melting" gel (Seakem GTG type) containing 3% agarose.

[0537] The sensitivity of the nested test is routinely, under
the conditions described, 10 copies of RNA.

[0538] 4. The fragments can then be purified on QIAquick
PCR kit (QIAGEN) and sequenced with the oligos SAR1-S
and SAR1-AS.

3) Detection of the SARS-CoV RNA by PCR from Respiratory
Samples

a) First Comparative Study

[0539] A comparative study was carried out on a series of
respiratory samples received by the National Reference
Center for the Influenza Virus (Northern region) and likely
to contain SARS-CoV. To do this, the RNA was extracted
from the samples with the aid of the "Qiamp viral RNA
extraction" kit (Qiagen) and analyzed by real time RT-PCR,
on the one hand with the aid of the pairs of primers and
probes of the No. 2 series under the conditions described
above, and on the other hand with the aid
of the kit "LightCycler SARS-CoV quantification kit" marketed
by Roche (reference 03 604 438). The results are
summarized in table VI below. They show that 18 of the 26
samples are negative and 5 of the 26 samples are positive for
the two kits, while one sample is positive for the Roche kit
alone and two for the "series 2" N reagents alone. Additionally,
for 3 samples (20032701, 20032712, 20032714) the
quantities of RNA detected are markedly higher with the
reagents (probes and primers) of the No. 2 series. These
results indicate that the "series 2" N primers and probes are
more sensitive for the detection of the SARS-CoV genome
in biological samples than those of the kit currently available.

TABLE VI

Real time RT-PCR analysis of the RNAs

the aid of the pairs of primers P and probes of the No. 2
series ("series 2" N) or of the kit "Lightcycler SARS-
CoV quantification kit" (Roche). The type of sample is
indicated as well as the number of copies of viral
genome measured in each of the two tests. NEG: negative
RT-PCR.

ROCHE



20033082 
20033083 
20033086 
20033087 

20032803 
20032806 
20031 746ARN2 
20032711 
20032910 
20032911 
20033356 
20033357 

20032657 
20032698 
20032720 
20033074 
20032701 
20032702 
20031 747 ARN2 
20032712 

20032800 

20033384 



nasal or pharyngeal 

pharyngeal 

nasal or pharyngeal 
b) Second Comparative Study

[0540] The performance of various nested RT-PCR and
real time RT-PCR methods was then compared for 121
respiratory samples from possible cases of SARS at the
French hospital in Hanoi, Vietnam, taken between the 4th
and the 17th day after the onset of the symptoms. Among
these samples, 14 were found to be positive during a first test
using the nested RT-PCR method targeting ORF1b (encoding
replicase) as described initially by Bernhard Nocht
Institute (BNI nested RT-PCR). Information relating to this
test is available on the internet, at the address
http://www15.bni-hamburg.de/bni2/neu2/getfile.acgi?area_engl=diagnostics&pid=4112.

[0541] The various tests compared in this study are: 

[0542] the quantitative RT-PCR method according to 
the invention, with the "series 2" N primers and probes 
described above (LightCycler N column), 

[0543] the nested RT-PCR test targeting the RNA poly- 
merase gene described above, developed by the CDC, 
BNI and Institut Pasteur (CDC/IP nested RT-PCR), 

[0544] the ARTUS kit with the reference "HPA Corona 
LC RT-PCR Kit # 5601-02", which is a real time 
RT-PCR test targeting the ORF1b gene,

[0545] the BNI nested RT-PCR test, also targeting the 
RNA polymerase gene mentioned above. 

[0546] The inventors observed: 

[0547] 1) an inter-test variability for the same technique, 
linked to the degradation of the RNA preparation during 



repeated thawing, in particular for the samples containing 
the lowest quantities of RNA, 

[0548] 2) a reduced sensitivity of the CDC/IP nested 
RT-PCR compared with the BNI nested RT-PCR, and 
[0549] 3) a comparable sensitivity of the quantitative 
RT-PCR test according to the invention (LightCycler N) 
compared with the Artus LightCycler (LC) test.
[0550] These results, which are presented in table VII 
below, show that the quantitative RT-PCR test according to 
the invention constitutes an excellent addition — or an alter- 
native—to the tests currently available. Indeed, the SARS- 
linked coronavirus is an emergent virus which is capable of 
changing rapidly. In particular, the gene for the RNA poly- 
merase of the SARS-linked coronavirus, which is targeted in 
most of the tests currently available, can recombine with that 
of other coronaviruses not linked to SARS. The use of a test 
targeting this gene exclusively could then lead to the pro- 
duction of false-negatives. 

[0551] The quantitative RT-PCR test according to the 
invention does not target the same genomic region as the 
ARTUS kit since it targets the gene encoding the N protein. 
By carrying out a diagnostic test targeting two different 
genes of the SARS-linked coronavirus, it can therefore be 
hoped to avoid false-negative type results which could be 
due to the genetic evolution of the virus. 

[0552] Furthermore, it appears particularly advantageous 
to target the gene for the nucleocapsid protein because it is 
very stable because of the high selection pressure linked to 
the high structural constraints regarding this protein. 



TABLE VII 



gene amplification, from 121 samples of probable cases
of SARS at the French hospital in Hanoi, Vietnam
(epidemic 2003)

Sample type (1)   Sample collection day   Patient   CDC/IP nested RT-PCR



Light 
Cycler 



Light 
Cycler 
NOP) 



032530 
032531 
032534 



032690 
032727 
032728 
032729 
032730 
032741 



NHH Negative Positive 



NVH Negative Positive 



Positive Positive 



NVH Positive 



3.10E+01 4.20E+01 



Negative 0.30E+02 

Negative Negative 

Negative Negative 

1.20E+01 2.30E+02 

1.60E+00 Negative 

2.30E+02 4.00E+02

1.10E+03 1.60E+04 

5.90E+00 3.40E+01 

1.30E+02 4.80E+02 

2.10E+02 1.30E+02 



71.4% 



(1) P = pharyngeal swab N = nasal swab 
Production and Characterization of Monoclonal
Antibodies Directed Against the N Protein

[0553] Balb C mice were immunized with the purified 
recombinant N protein and their spleen cells fused with an 
appropriate murine myeloma according to the Kohler and 
Milstein techniques. 

[0554] Nineteen anti-N antibody secreting hybridomas 
d and their ir 



These antibodies do indeed recognize the recombinant N 
protein (in ELISA) with variable intensities, and the natural 
viral N protein in ELISA and/or in Western blotting. FIGS. 
18 to 20 show the results of these tests for 15 of these 19 
monoclonal antibodies. 

[0555] The highly reactive clones 12, 17, 28, 57, 72, 76, 
86, 87, 98, 103, 146, 156, 166, 170, 199, 212, 218, 219 and 
222 were subcloned. Specificity studies were carried out 
with the appropriate tools in order to determine the epitopes 
recognized and verify the absence of reactivity toward other 
human coronaviruses and certain respiratory viruses. 

[0556] Epitope mapping studies (performed on spot mem- 
brane with the aid of overlapping peptides of 15 aa) and 
additional studies performed on the natural N protein in 
Western blotting revealed the existence of 4 groups of 
monoclonal antibodies: 

[0557] 1. Monoclonal antibodies specific for a major linear
epitope at the N-ter position (75-81, sequence:
INTNSGP).

[0558] The representative of this group is antibody 156. 
The hybridoma producing this antibody was deposited at the 
Collection Nationale de Cultures de Microorganismes 
(CNCM) of the Institut Pasteur (Paris, France) on Dec. 1, 
2004, under the number I-3331. This same epitope is also
recognized by a rabbit serum (anti-N polyclonal) obtained 
by conventional immunization with the aid of this same N 
protein. 



[0559] 2. Monoclonal antibodies specific for a major linear
epitope located in a central position (position 217-224,
sequence: ETALALL); the representatives of this group are
the monoclonal antibodies 87 and 166. The hybridoma
producing antibody 87 was deposited at the CNCM on Dec.
1, 2004, under the number I-3328.

[0560] 3. Monoclonal antibodies specific for a major linear
epitope located at the C-terminal position (position
403-408, sequence: DFFRQL); the representatives of this
group are the antibodies 28, 57 and 143. The hybridoma
producing antibody 57 was deposited at the CNCM on Dec.
1, 2004, under the number I-3330.

[0561] 4. Monoclonal antibodies specific for a discontinuous
conformational epitope. This group of antibodies does
not recognize any of the peptides spanning the sequence of
the N protein, but reacts strongly on the non-denatured
natural protein. The representative of this final group is the
antibody 86. The hybridoma producing this antibody was
deposited at the CNCM on Dec. 1, 2004, under the number
I-3329.



Antibody Epitope 



LPQRQ 

ETALALLiI 
ETALALL 
INTNSGP 



[0563] In addition, as illustrated in particular in FIGS. 18 
and 19, these antibodies exhibit no reactivity in ELISA 
and/or in WB toward the N protein of the human corona- 



EXAMPLE 10 

Combinations of the Monoclonal Antibodies for the
Development of a Sensitive Immunocapture Test
Specific for the Viral N Antigen in the Serum or
Biological Fluids of Patients Infected with the
SARS-CoV Virus

[0564] The antibodies listed below were selected because 
of their very specific properties for an additional capture and 
detection study of the viral N protein, in the serum of the 
subjects or patients. 

[0565] These antibodies were produced in ascites on mice, 
purified by affinity chromatography and used alone or in 
combination, as capture antibodies and as signal antibodies. 

[0566] List of the antibodies selected: 

[0567] Ab anti-C-ter region (No. 28, 57, 143) 

[0568] Ab anti-central region (No. 87, 166)

[0569] Ab anti-N-ter region (No. 156)

[0570] Ab anti-discontinuous conformational epitope 
(86) 

1) Preparation of the Reagents: 

a) Immunocapture ELISA Plates 

[0571] The plates are sensitized with the antibody solutions
at 5 ug/ml in 0.1 M carbonate buffer, pH 9.6. The
(monovalent or plurivalent) solutions are deposited in a
volume of 100 ul in the wells and incubated overnight at
room temperature. These plates are then washed with PBS
buffer (10 mM pH 7.4 supplemented with 0.1% Tween 20)
and then saturated with a PBS solution supplemented with
0.3% BSA and 5% sucrose. The plates are then dried and
then packaged in a bag in the presence of a desiccant. They
b) Conjugates

[0572] The purified antibodies were coupled with peroxi- 
dase according to the Nakane protocol (Nakane et al. — 
1974, J. of Histo and cytochemistry, vol. 22, pp. 1084-1091) 
in a ratio of one molecule of IgG per 3 molecules of 
peroxidase. These conjugates were purified by exclusion 
chromatography and stored concentrated (concentration 
between 1 and 2 mg/ml) in the presence of 50% glycerol and 
at -20° C. They are diluted for their use in the assays at the 
final concentration of 1 or 2 ug/ml in PBS buffer (pH 7.4) 
supplemented with 1% BSA. 

c) Other Reagents 

[0573] Human sera negative for all the serum markers
for the HIV, HBV, HCV and HTLV viruses. Pool of
negative human sera supplemented with 0.5% Triton
X-100

[0574] Inactivated viral Ag: viral culture supernatant
inactivated by irradiation and inactivation verified after
placing in culture on sensitive cells; titer of the suspension
before inactivation about 10^7 infectious particles
per ml or alternatively about 5x10^9 physical viral
particles per ml of antigen

[0575] The Ag samples diluted in negative human 
serum: these samples were prepared by diluting 1:100 
and then by 5-fold serial dilution. 

[0576] These noninfectious samples mimic human 
samples thought to contain low to very low concentra- 
tions of viral nucleoprotein N. Such samples are not 
available for routine work. 

[0577] Washing solution R2, solution for visualization
TMB R8, chromogen R9 and stop solution R10, are the
generic reagents marketed by Bio-Rad in its ELISA kits
(e.g.: Platelia pylori kit ref. 72778).

2) Procedure 

[0578] The samples of human sera overloaded with inac- 
tivated viral Ag are distributed in an amount of 100 ul per 
well, directly in the ready-to-use sensitized plates, and then 
incubated for 1 hour at 37° C. (Bio-Rad IPS incubation). 
[0579] The material not bound to the solid phase is 
removed by 3 washings (washing with dilute R2 solution, 
automatic LP 35 washer). 

[0580] The appropriate conjugates, diluted to the final 
concentration of 1 or 2 ug/ml, are distributed in an amount 
of 100 ul per well and the plates are again incubated for one 
hour at 37° C. (IPS incubation). 

[0581] The excess conjugate is removed by 4 successive 
washings (dilute R2 solution— LP 35 washer). 
[0582] The presence of conjugate attached to the plates is 
visualized after adding 100 ul of visualization solution 
prepared before use (1 ml of R9 and 10 ml of R8) and after 
incubation for 30 minutes, at room temperature and pro- 
tected from light. 

[0583] The enzymatic reaction is finally blocked by adding
100 ul of R10 reagent (1 N H2SO4) to all the wells.

[0584] The reading is carried out with the aid of an 
appropriate microplate reader at double wavelength (450/ 
620 nm). 



[0585] The results can be interpreted by using, as provisional
threshold value, the mean of at least two negative
controls multiplied by a factor of 2 or alternatively the mean
of 100 negative sera supplemented with an increment
corresponding to 6 SD (standard deviation calculated on the
100 individual measurements).
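
The two provisional threshold rules can be expressed as short functions. This is a minimal sketch, not the procedure of any kit: the function names and the OD readings are hypothetical, and the standard deviation is assumed to be the sample standard deviation, which the text does not specify.

```python
from statistics import mean, stdev

def threshold_from_controls(negative_controls, factor=2.0):
    """Cutoff as the mean OD of at least two negative controls times a factor."""
    if len(negative_controls) < 2:
        raise ValueError("at least two negative controls are required")
    return factor * mean(negative_controls)

def threshold_from_panel(negative_sera, n_sd=6.0):
    """Cutoff as the mean OD of a negative panel plus n_sd standard deviations."""
    # Assumption: sample standard deviation (statistics.stdev).
    return mean(negative_sera) + n_sd * stdev(negative_sera)

def is_positive(od, threshold):
    """A sample is scored positive when its OD exceeds the cutoff."""
    return od > threshold

# Hypothetical OD readings (450/620 nm), for illustration only.
controls = [0.040, 0.050]
cutoff = threshold_from_controls(controls)
panel_cutoff = threshold_from_panel([0.040, 0.050, 0.060])
print(round(cutoff, 3), round(panel_cutoff, 3))
print(is_positive(0.480, cutoff), is_positive(0.060, cutoff))
```

In practice the panel rule would be applied to the 100 individual negative sera mentioned above; the three-value panel here only keeps the example short.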

3) Results 

[0586] Various capture antibody and signal antibody com- 
binations were tested based on the properties of the anti- 
bodies selected, and avoiding the combinations of antibodies 
specific for the same epitopes in solid phase and as conju- 
gates. 

[0587] The best results were obtained with the 4 combi- 
nations listed below. These results are reproduced in table IX 

1. Combination F/28 

[0588] Solid phase (Ab 166+87 central region): conjugate 
antibody 28 (C-ter) 

2. Combination G/28 

[0589] Solid phase (Ab 86 — conformational epitope): 
conjugate antibody 28 (C-ter) 

3. Combination H/28 

[0590] Solid phase (Ab 86, 166 and 87 central region and 
conformational epitope): conjugate antibody 28 (C-ter) 

4. Combination H/28+87 

[0591] Solid phase (Ab 86, 166 and 87 central region and 
conformational epitope): mixed conjugate antibodies 28 
(C-ter) and 87 (central) 

5. Combination G/87 

[0592] Solid phase (Ab 86— conformational epitope): 
conjugate antibody 87 (central region) 

[0593] The first 4 combinations exhibit equivalent and 
reproduced performance levels, greater than the other com- 
binations used (such as for example the combination G/87). 
Of course, in these combinations, a monoclonal antibody 
may be replaced with another antibody recognizing the same 
epitope. Thus, the following variants may be mentioned: 

6. Variant of the Combination F/28 

[0594] Solid phase (Ab 87 only): conjugate antibody 57 
(C-ter) 

7. Variant of the Combination G/28 

[0595] Solid phase (Ab 86— conformational epitope): 
conjugate antibody 57 (C-ter) 

8. Variant of the Combination H/28 

[0596] Solid phase (Ab 86 and 87 central region and 
conformational epitope): conjugate antibody 57 (C-ter) 

9. Variant of the Combination H/28+87 

[0597] Solid phase (Ab 86 and 87 central region and 
conformational epitope): mixed conjugate antibodies 57 
(C-ter) and 87 (central) 
V5 500
Va 500 
Va 500 
M,2 500 



0.480 
0.240 
0.186 
0.193 



[0598] The detection limit for these 4 experimental trials
corresponds to the antigen dilution in negative serum 1:62 500.
A rapid extrapolation suggests the detection of less than
10^3 infectious particles per ml of sera.
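
The extrapolation can be retraced from the figures given for the inactivated antigen. The sketch below is illustrative: it uses the approximate titers stated in the reagent description (about 10^7 infectious particles and about 5x10^9 physical particles per ml) and assumes a series of five dilutions, chosen so that the last one is the 1:62 500 detection limit; the names are hypothetical.

```python
STOCK_INFECTIOUS_PER_ML = 1e7   # approximate titer before inactivation (from the text)
STOCK_PHYSICAL_PER_ML = 5e9     # approximate physical particles per ml (from the text)

def dilution_series(first=100, step=5, n=5):
    """Reciprocal dilutions: 1:100 first, then 5-fold serial dilutions."""
    return [first * step ** i for i in range(n)]

series = dilution_series()      # n=5 is an assumption ending at 1:62 500
limit = series[-1]
infectious_at_limit = STOCK_INFECTIOUS_PER_ML / limit
physical_at_limit = STOCK_PHYSICAL_PER_ML / limit
print(series, infectious_at_limit, physical_at_limit)
```

On these figures the 1:62 500 dilution corresponds to a few hundred infectious particles per ml, which is consistent with the statement of less than 10^3.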
[0599] From this study, it is evident that the most appro- 
priate antibodies for the capture of the native viral nucle- 
oprotein are the antibodies specific for the central region 
and/or for a conformational epitope, both being antibodies 
also selected for their high affinity for the native antigen. 
[0600] Having determined the best antibodies for the 
composition of the solid phase, the antibodies to be selected 
as a priority for the detection of the antigens attached to the 
solid phase are the complementary antibodies specific for a 
dominant epitope in the C-ter region. The use of any other 
complementary antibody specific for epitopes located in the 
N-ter region of the protein leads to average or poor results. 

EXAMPLE 11 

Eukaryotic Expression Systems for the 
SARS-Associated Coronavirus (SARS-CoV) 
spicule (S) Protein 



[0601] The conditions for transient expression of the 
SARS-CoV spicule (S) protein were optimized in mamma- 
lian cells (293T, VeroE6). 

[0602] For that, a DNA fragment containing the cDNA for
SARS-CoV S was amplified by PCR with the aid of the
oligonucleotides 5'-ATAGGATCCA CCATGTTTAT
TTTCTTATTA TTTCTTACTC TCACT-3' and
5'-ATACTCGAGTT ATGTGTAATG TAATTTGACA CCCTTG-3' from
the plasmid pSARS-S (C.N.C.M. No. I-3059) and then
inserted between the BamHI and XhoI sites of the plasmid
pTRIPΔU3-CMV containing a lentiviral vector TRIP (Sirven,
2001, Mol. Ther., 3, 438-448) in order to obtain the
plasmid pTRIP-S. The BamHI and XhoI fragment containing
the cDNA for S was then subcloned between BamHI and
XhoI of the eukaryotic expression plasmid pcDNA3.1(+)
(Clontech) in order to obtain the plasmid pcDNA-S. The
NheI and XhoI fragment containing the cDNA for S was
then subcloned between the corresponding sites of the
expression plasmid pCI (Promega) in order to obtain the
plasmid pCI-S. The WPRE sequences of the woodchuck
hepatitis virus ("Woodchuck Hepatitis Virus posttranscriptional
regulatory element") and the CTE sequences ("constitutive
transport element") of the simian retrovirus from
Mason-Pfizer were inserted into each of the two plasmids
pcDNA-S and pCI-S between the XhoI and XbaI sites in
order to obtain respectively the plasmids pcDNA-S-CTE,
pcDNA-S-WPRE, pCI-S-CTE and pCI-S-WPRE (FIG. 21).
The plasmid pCI-S-WPRE was deposited at the CNCM, on
Nov. 22, 2004, under the number I-3323. All the inserts were
sequenced with the aid of a BigDye Terminator v1.1 kit
(Applied Biosystems) and an automated sequencer ABI377.

[0603] The capacity of the plasmid constructs to direct the
expression of SARS-CoV S in mammalian cells was
assessed after transfection of VeroE6 cells (FIG. 22). In this
experiment, monolayers of 5x10^5 VeroE6 cells in 35 mm
Petri dishes were transfected with 2 ug of plasmids pcDNA
(as control), pcDNA-S, pCI and pCI-S and 6 ul of Fugene6
reagent according to the manufacturer's instructions
(Roche). After 48 hours of incubation at 37° C. and under
5% CO2, cellular extracts were prepared in loading buffer
according to Laemmli, separated on 8% SDS polyacrylamide
gel, and then transferred onto a PVDF membrane
(BioRad). The detection of this immunoblot (Western blot)
was carried out with the aid of an anti-S rabbit polyclonal
serum (immune serum from the rabbit P11135: cf. example
4 above) and donkey polyclonal antibodies directed against
rabbit IgGs and coupled with peroxidase (NA934V, Amersham).
The bound antibodies were visualized by luminescence
with the aid of the ECL+ kit (Amersham) and
autoradiography films Hyperfilm MP (Amersham).

[0604] This experiment (FIG. 22) shows that the plasmid 
pcDNA-S does not make it possible to direct the expression 
of SARS-CoV S at detectable levels whereas the plasmid 
pCI-S allows a weak expression, close to the limit of 
detection, which becomes visible only when the film is overexposed.
Similar results were obtained when the expression of
S was sought by immunofluorescence (data not shown). This
impossibility to detect effective expression of S cannot be 
attributed to the detection techniques used since the S 
protein can be detected at the expected size (180 kDa) in an 
extract of cells infected with SARS-CoV or in an extract of 
VeroE6 cells infected with the recombinant vaccinia virus 
VV-TF7.3 and transfected with the plasmid pcDNA-S. In
this latter experiment, the virus VV-TF7.3 expresses the
RNA polymerase of the T7 phage and allows the cytoplas- 
mic transcription of an uncapped RNA capable of being 
efficiently translated. This experiment suggests that the 
expression defects described above are due to an intrinsic 
inability of the cDNA for S to be efficiently expressed when 
the step for transcription to messenger RNA is carried out at 
the nuclear level. 

[0605] In a second experiment, the effect of the CTE and 
WPRE signals on the expression of S was assessed after 
transfection of VeroE6 (FIG. 23A) and 293T (FIG. 23B) 
cells and according to a protocol similar to that described 
above. Whereas the expression of S cannot be detected after 
transfection of the plasmids pcDNA-S-CTE and pcDNA-S- 
WPRE derived from pcDNA-S, the insertion of the WPRE 
and CTE signals greatly improves the expression of S in the 
context of the expression plasmid pCI-S. 

[0606] To refine this result, a second series of experiments
was carried out in which the immunoblot is quantitatively
visualized by luminescence and acquisition on a
digital imaging device (Fluor S, BioRad). The analysis of the
results obtained with the QuantityOne v4.2.3 software (Bio- 
Rad) shows that the WPRE and CTE sequences increase 
respectively the expression of S by a factor of 20 to 42 and 
10 to 26 in Vero E6 cells (table X). In 293T cells (table X), 
the effect of the CTE sequence is more moderate (4 to 5 
times) whereas that of the WPRE sequence remains high (13 
to 28 times). 

TABLE X 

Quantitative analysis of the effect of the CTE 
and WPRE signals on the expression of SARS-CoV S: 

Cellular extracts were prepared 48 hours after 
transfection of VeroE6 or 293T cells with the plasmid 
pCI, pCI-S, pCI-S-CTE and pCI-S-WPRE and analyzed by 
Western blotting as described in the legend to 
FIG. 22. The Western blot is visualized by 
luminescence (ECL+, Amersham) and acquisition on a 
digital imaging device (FluorS, BioRad). The expression 



after transfection of the plasr 
Two independent experiments were c 
of the two cell types. In experiment 
the transfections were carried out in 



pCI-S 

pCI-S-CTE 

pCI-S-WPRE 

pCI

pCI-S

pCI-S-CTE

pCI-S-WPRE



[0607] In summary, all these results show that the expres- 
sion, in mammalian cells, of the cDNA for the SARS-CoV 
S under the control of the RNA polymerase II promoter 
sequences requires, to be efficient, the expression of a splice 
signal and of either of the sequences WPRE and CTE. 



[0608] The cDNA for the SARS-CoV S protein was 
cloned in the form of a BamHI-XhoI fragment into the
plasmid pTRIPΔU3-CMV containing a defective lentiviral
vector TRIP with central DNA flap (Sirven et al., 2001, Mol. 
Ther., 3: 438-448) in order to obtain the plasmid pTRIP-S 
(FIG. 24). Transient cotransfection according to Zennou et 
al. (2000, Cell, 101: 173-185) of this plasmid, of an encapsi- 
dation plasmid (p8.2) and of a plasmid for expression of the 
VSV envelope glycoprotein G (pHCMV-G) in 293T cells 
allowed the preparation of retroviral pseudoparticles containing
the vector TRIP-S and pseudotyped with the envelope
protein G. These pseudotyped TRIP-S vectors were
used to transduce 293T and FRhK-4 cells: no expression of
the S protein could be detected by Western blotting and 
immunofluorescence in the transduced cells (data not pre- 
sented). 

[0609] The optimum expression cassettes consisting of the 
CMV virus immediate/early promoter, a splice signal, cDNA 
for S and either of the posttranscriptional signals WPRE or 
CTE described above were then substituted for the EF1α-EGFP
cassette of the defective lentiviral expression vector
with central DNA flap TRIPΔU3-EF1α (Sirven et al., 2001,
Mol. Ther., 3: 438-448) (FIG. 25). These substitutions were 
carried out by a series of successive subclonings of the S 
expression cassettes which were excised from the plasmids 
pCI-S-CTE (BglII-ApaI) or respectively pCI-S-WPRE
(BglII-SalI) and then inserted between the MluI and KpnI
sites or respectively the MluI and XhoI sites of the plasmid
TRIPΔU3-EF1α in order to obtain the plasmids pTRIP-SD/SA-S-CTE
and pTRIP-SD/SA-S-WPRE, deposited at the
CNCM, on Dec. 1, 2004, under the numbers I-3336 and
I-3334, respectively. Pseudotyped vectors were produced
according to Zennou et al. (2000, Cell, 101: 173-185) and
used to transduce 293T cells (10 000 cells) and FRhK-4 cells
(15 000 cells) according to a series of 5 successive transduction
cycles with a quantity of vectors corresponding to
25 ng (TRIP-SD/SA-S-CTE) or 22 ng (TRIP-SD/SA-S-WPRE)
of p24 per cycle.

[0610] The transduced cells were cloned by limiting dilu- 
tion and a series of clones were qualitatively analyzed for the 
expression of SARS-CoV S by immunofluorescence (data 
not shown), and then quantitatively by Western blotting 
(FIG. 25) with the aid of an anti-S rabbit polyclonal serum. 
The results presented in FIG. 25 show that clones 2 and 15 
of FRhK4-S-CTE cells transduced with TRIP-SD/SA-S-CTE
and clones 4, 9 and 12 of FRhK4-S-WPRE cells transduced 
with TRIP-SD/SA-S-WPRE allow the expression of the 
SARS-CoV S at respectively low or moderate levels if they 
are compared to those which can be observed during infec- 
tion with SARS-CoV. 

[0611] In summary, the vectors TRIP-SD/SA-S-CTE and 
TRIP-SD/SA-S-WPRE allow the production of stable 
clones of FRhK-4 cells and similarly 293T cells expressing 
SARS-CoV S, whereas the assays carried out with the 
"parent" vector TRIP-S remained unsuccessful, which dem- 
onstrates the need for a splice signal and for either of the 
sequences CTE and WPRE for the production of stable cell 
clones expressing the S protein. 

[0612] In addition, these modifications of the vector TRIP 
(insertion of a splice signal and of a post-transcriptional 
signal like CTE and WPRE) could prove advantageous for 
improving the expression of other cDNAs than that for S. 

[0613] 3) Production of stable lines allowing the expres- 
sion of a soluble form of SARS-CoV S. Purification of this 
recombinant antigen. 

[0614] A cDNA encoding a soluble form of the S protein 
(Ssol) was obtained by fusing the sequences encoding the 
ecto-domain of the protein (amino acids 1 to 1193) with 
those of a tag (FLAG: DYKDDDDK) via a BspEI linker
encoding the SG dipeptide. Practically, in order to obtain the 
plasmid pcDNA-Ssol, a DNA fragment encoding the 
ectodomain of SARS-CoV S was amplified by PCR with the 
aid of the oligonucleotides 5'-ATAGGATCCA CCATGTTTAT
TTTCTTATTA TTTCTTACTC TCACT-3' and 5'-ACCTCCGGAT
TTAATATATT GCTCATATTT TCCCAA-3'
from the plasmid pcDNA-S, and then inserted between the
unique BamHI and BspEI sites of a modified eukaryotic
expression plasmid pcDNA3.1(+) (Clontech) containing the
tag sequence FLAG between its BamHI and XhoI sites:

[0615] The NheI-XhoI and BamHI-XhoI fragments,
containing the cDNA for S, were then excised from the 
plasmid pcDNA-Ssol, and subcloned between the corre- 
sponding sites of the plasmid pTRIP-SD/SA-S-CTE and of 
the plasmid pTRIP-SD/SA-S-WPRE, respectively, in order
to obtain the plasmids pTRIP-SD/SA-Ssol-CTE and pTRIP-SD/SA-Ssol-WPRE,
deposited at the CNCM, on Dec. 1,
2004, under the numbers I-3337 and I-3335, respectively.
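
The two oligonucleotides quoted in paragraph [0614] can be checked for their cloning sites and reading frame with a short script. The following is only an illustrative sketch: the helper names (`reverse_complement`, `translate`, `sites_in`) are hypothetical and not part of the source, and the only inputs are the two primer sequences and the standard recognition sequences of BamHI (GGATCC) and BspEI (TCCGGA).

```python
# Sketch: inspect the two PCR primers of paragraph [0614].
# Helper names are illustrative assumptions, not taken from the source.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SITES = {"BamHI": "GGATCC", "BspEI": "TCCGGA"}

FORWARD = "ATAGGATCCACCATGTTTATTTTCTTATTATTTCTTACTCTCACT"
REVERSE = "ACCTCCGGATTTAATATATTGCTCATATTTTCCCAA"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons of seq, starting at position 0."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def sites_in(seq):
    """Names of the listed restriction sites present in seq."""
    return sorted(name for name, site in SITES.items() if site in seq)


# Forward primer: read from the first ATG.
forward_orf = translate(FORWARD[FORWARD.index("ATG"):])
# Reverse primer: read the coding strand, i.e. its reverse complement.
reverse_orf = translate(reverse_complement(REVERSE))

print(sites_in(FORWARD), forward_orf)
print(sites_in(REVERSE), reverse_orf)
```

Read this way, the forward primer carries a BamHI site followed by an open reading frame beginning MFIFLLFLTLT, which agrees with the first residues of the signal peptide quoted in paragraph [0619], and the coding strand of the reverse primer ends with the codons TCC GGA (S, G) supplied by the BspEI site, in line with the SG dipeptide linker described above.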

[0616] Pseudotyped vectors were produced according to 
Zennou et al. (2000, Cell, 101:173-185) and used to trans- 
duce FRhK-4 cells (15 000 cells) according to a series of 5 
successive transduction cycles (15 000 cells) with a quantity 
of vector corresponding to 24 ng (TRIP-SD/SA-Ssol-CTE) 
or 40 ng (TRIP-SD/SA-Ssol-WPRE) of p24 per cycle. The 
transduced cells were cloned by limiting dilution and a 
series of 16 clones transduced with TRIP-SD/SA-Ssol-CTE 
and of 15 clones with TRIP-SD/SA-Ssol-WPRE were ana- 
lyzed for the expression of the Ssol polypeptide by Western 
blotting visualized with an anti-FLAG monoclonal antibody 
(FIG. 26 and data not presented), and by capture ELISA 
specific for the Ssol polypeptide which was developed for 
this purpose (table XI and data not presented). Part of the 
process for selecting the best secretory clones is shown in 
FIG. 26. Capture ELISA is based on the use of solid phases 
coated with polyclonal antibodies of rabbits immunized with 
purified and inactivated SARS-CoV. These solid phases 
allow the capture of the Ssol polypeptide secreted into the 
cellular supernatants, whose presence is then visualized with 
a series of steps successively involving the attachment of an 
anti-FLAG monoclonal antibody (M2, SIGMA), of anti- 
mouse IgG(H+L) biotinylated rabbit polyclonal antibodies 
(Jackson) and of a streptavidin-peroxidase conjugate (Amer- 
sham) and then the addition of chromogen and substrate 
(TMB+H₂O₂, KPL).

TABLE XI 

Analysis of the expression of the Ssol 
polypeptide by cell lines transduced with the 
lentiviral vectors TRIP-SD/SA-Ssol-WPRE and TRIP-SD/SA-
Ssol-CTE. The secretion of the Ssol polypeptide was 



isolated after transduction of FRhK-4 cells with the 
lentiviral vectors TRIP-SD/SA-Ssol-WPRE and TRIP-SD/SA-
Ssol-CTE. The supernatants diluted 1/50 were analyzed 
by a capture ELISA test specific for SARS-CoV S. 



CTE2 
CTE3 
CTE9 
CTE12 
CTE13 
WPRE1 



[0617] The cell line secreting the highest quantities of Ssol 
polypeptide in the culture supernatant is the FRhK4-Ssol-CTE3
line. It was subjected to a second series of 5 cycles of
transduction with the vector TRIP-SD/SA-Ssol-CTE under 
conditions similar to those described above and then cloned. 
The subclone secreting the highest quantities of Ssol was 
selected by a combination of Western blot and capture 
ELISA analysis: it is the subclone FRhK4-Ssol-30, which 
was deposited at the CNCM, on Nov. 22, 2004, under the 
number I-3325.

[0618] The FRhK4-Ssol-30 line allows the quantitative 
production and purification of the recombinant Ssol 
polypeptide. In a typical experiment where the experimental 
conditions for growth, production and purification were 
optimized, the cells of the FRhK4-Ssol-30 line are inocu- 
lated in standard culture medium (pyruvate-free DMEM 
containing 4.5 g/l of glucose and supplemented with 5%
FCS, 100 U/ml of penicillin and 100 µg/ml of streptomycin)
in the form of a subconfluent monolayer (1 million cells per
100 cm² in 20 ml of medium). At confluence, the
standard medium is replaced with the secretion medium
where the quantity of FCS is reduced to 0.5% and the
quantity of medium reduced to 16 ml per 100 cm². The
culture supernatant is removed after 4 to 5 days of incubation
at 35° C. and under 5% CO₂. The recombinant polypeptide
Ssol is purified from the supernatant by the succession
of steps of filtration on 0.1 µm polyethersulfone (PES)
membrane, concentration by ultrafiltration on a PES mem- 
brane with a 50 kD cut-off, affinity chromatography on 
anti-FLAG matrix with elution with a solution of FLAG 
peptide (DYKDDDDK) at 100 µg/ml in TBS (50 mM Tris,
pH 7.4, 150 mM NaCl) and then gel filtration chromatography
in TBS on Sephadex G-75 beads (Pharmacia). The
concentration of the purified recombinant Ssol polypeptide 
was determined by micro-BCA test (Pierce) and then its 
biochemical characteristics analyzed. 

[0619] Analysis by 8% SDS acrylamide gel stained with 
silver nitrate demonstrates a predominant polypeptide 
whose molecular mass is about 180 kD and whose degree of
purity may be evaluated at 98% (FIG. 27A). Two main peaks
are detected by SELDI-TOF mass spectrometry (Ciphergen):
they correspond to single and double charged forms of
a predominant polypeptide whose molecular mass is thus
determined at 182.6±3.7 kD (FIGS. 27B and C). After
transfer onto Prosorb membrane and rinsing in 0.1% TFA, 
the N-terminal end of the Ssol polypeptide was sequenced in 
liquid phase by Edman degradation on 5 residues (ABI494, 
Applied Biosystems) and determined as being SDLDR 
(FIG. 27D). This demonstrates that the signal peptide 
located at the N-terminal end of the SARS-CoV S protein, 
composed of aa 1 to 13 (MFIFLLFLTLTSG) according to an 
analysis carried out with the software signalP v2.0 (Nielsen 
et al., 1997, Protein Engineering, 10:1-6), is cleaved from 
the Ssol polypeptide. The recombinant Ssol polypeptide
consists of amino acids 14 to 1193 of the
SARS-CoV S protein fused at the C-terminus with a
sequence SG DYKDDDDK containing the sequence of the
FLAG tag (underlined). The difference between the theo- 
retical molar mass of the naked Ssol polypeptide (132.0 kD) 
and the real molar mass of the mature polypeptide (182.6 
kD) suggests that the Ssol polypeptide is glycosylated. 

[0620] A preparation of purified Ssol polypeptide, whose 
protein concentration was determined by micro-BCA test, 
makes it possible to prepare a calibration series in order to 
measure, with the aid of the capture ELISA test described
above, the concentrations of Ssol present in the culture
supernatants and to review the characteristics of the secretory
lines. According to this test, the FRhK4-Ssol-CTE3 line
secretes 4 to 6 µg/ml of polypeptide Ssol while the FRhK4-Ssol-30
line secretes 9 to 13 µg/ml of Ssol after 4 to 5 days
of culture at confluence. In addition, the purification scheme 
presented above makes it possible routinely to purify from 
1 to 2 mg of Ssol polypeptide per liter of culture supernatant. 

EXAMPLE 12 

Gene Immunization Involving the 
SARS-Associated Coronavirus (SARS-CoV)
Spicule (S) Protein 

[0621] The effect of a splice signal and of the posttran- 
scriptional signals WPRE and CTE was analyzed after gene 
immunization of BALB/c mice (FIG. 28).
[0622] For that, BALB/c mice were immunized at inter- 
vals of 4 weeks by injecting into the tibialis anterior a saline 
solution of 50 µg of plasmid DNA of pcDNA-S and pCI-S
and, as a control, 50 µg of plasmid DNA of pcDNA-N
(directing the expression of SARS-CoV N) or of pCI-HA 
(directing the expression of the HA of the influenza virus 
A/PR/8/34) and the immune sera collected 3 weeks after the 
2nd injection. The presence of antibodies directed against the
SARS-CoV S was assessed by indirect ELISA using as 
antigen a lysate of VeroE6 cells infected with SARS-CoV 
and, as a control, a lysate of noninfected VeroE6 cells. The 
anti-SARS-CoV antibody titers (TI) are calculated as the 
reciprocal of the dilution producing a specific OD of 0.5 
(difference between OD measured on a lysate of infected 
cells and OD measured on a lysate of noninfected cells) after 
visualization with an anti-mouse IgG polyclonal antibody 
coupled with peroxidase (NA931V, Amersham) and TMB 
supplemented with H₂O₂ (KPL) (FIG. 28A).
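
The titer definition in paragraph [0622] (reciprocal of the dilution giving a specific OD of 0.5) can be sketched as follows. This is a minimal illustration only: the source does not say how the dilution at OD 0.5 is read off the titration curve, so log-linear interpolation between the two bracketing dilutions is an assumption, and the function name `endpoint_titer` and the example readings are hypothetical.

```python
import math


def endpoint_titer(reciprocal_dilutions, od_infected, od_uninfected, cutoff=0.5):
    """Reciprocal dilution at which the specific OD falls to `cutoff`.

    Specific OD = OD on infected-cell lysate - OD on non-infected lysate.
    Dilutions must be listed from least to most diluted.
    ASSUMPTION: log-linear interpolation between bracketing dilutions.
    Returns None when the cutoff is not bracketed by the series.
    """
    specific = [a - b for a, b in zip(od_infected, od_uninfected)]
    for i in range(len(specific) - 1):
        above, below = specific[i], specific[i + 1]
        if above >= cutoff > below:
            fraction = (above - cutoff) / (above - below)
            log_low = math.log10(reciprocal_dilutions[i])
            log_high = math.log10(reciprocal_dilutions[i + 1])
            return 10 ** (log_low + fraction * (log_high - log_low))
    return None


# Hypothetical readings, for illustration only.
titer = endpoint_titer(
    reciprocal_dilutions=[100, 1000, 10000],
    od_infected=[1.6, 0.8, 0.3],
    od_uninfected=[0.1, 0.1, 0.1],
)
print(f"log10(TI) = {math.log10(titer):.2f}")
```

Returning `None` when the cutoff is never crossed mirrors the way titers below the first dilution are reported in the text as a bound rather than a value.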
[0623] Under these conditions, the expression plasmid 
pcDNA-S only allows the induction of low antibody titers 
directed against SARS-CoV S in 3 mice out of 6 
(log10(TI)=1.9±0.6) whereas the plasmid pcDNA-N
allows the induction of anti-N antibodies at high titers
(log10(TI)=3.9±0.3) in all the animals, and the control
plasmids (pCI, pCI-HA) do not result in any detectable
antibody (log10(TI)<1.7). The plasmid pCI-S equipped
with a splice signal allows the induction of antibodies at high
titers (log10(TI)=3.7±0.2), which are approximately 60
times higher than those observed after injection of the
plasmid pcDNA-S (p<10⁻⁵).
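
The "approximately 60 times" figure follows directly from the difference between the two mean log10 titers quoted above (3.7 and 1.9). A short check, with a hypothetical helper name:

```python
def fold_difference(log10_titer_a, log10_titer_b):
    """Ratio of two titers given as log10 values."""
    return 10 ** (log10_titer_a - log10_titer_b)


# Mean log10(TI) values quoted in paragraph [0623]: pCI-S versus pcDNA-S.
fold = fold_difference(3.7, 1.9)
print(round(fold))
```

A difference of 1.8 log10 units corresponds to a ratio of about 63, consistent with the rounded value given in the text.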

[0624] The efficiency of the posttranscriptional signals 
was studied by carrying out a dose-response study of the 
anti-S antibody titers induced in the BALB/c mouse as a 
function of the quantity of plasmid DNA used as immunogen
(2 µg, 10 µg and 50 µg). This study (FIG. 28B)
demonstrates that the posttranscriptional signal WPRE
greatly improves the efficiency of gene immunization when
small doses of DNA are used (p<10⁻⁵ for a dose of 2 µg of
DNA and p<10⁻² for a dose of 10 µg), whereas the effect of
the CTE signal remains marginal (p=0.34 for a dose of 2 µg
of DNA). 

[0625] Finally, the antibodies induced in mice after gene 
immunization neutralize the infectivity of SARS-CoV in 
vitro (FIGS. 29A and 29B) at titers which are consistent with 
the titers measured by ELISA. 



[0626] In summary, the use of a splice signal and of the 
posttranscriptional signal WPRE of the woodchuck hepatitis 
virus considerably improves the induction of neutralizing
antibodies directed against SARS-CoV after gene immuni- 
zation with the aid of plasmid DNA directing the expression 
of the cDNA for SARS-CoV S. 

EXAMPLE 13 

Diagnostic Applications of the S Protein 

[0627] The ELISA reactivity of the recombinant Ssol 
polypeptide was analyzed with respect to sera from patients 
suffering from SARS. 

[0628] The sera from probable cases of SARS tested were 
chosen on the basis of the results (positive or negative) of 
analysis of their specific reactivity toward the native anti- 
gens of SARS-CoV by immunofluorescence test on VeroE6 
cells infected with SARS-CoV and/or by indirect ELISA test 
using as antigen a lysate of VeroE6 cells infected with 
SARS-CoV. The sera of these patients are identified by a 
serial number of the National Reference Center for Influenza 
Viruses and by the initials of the patient and the number of 
days elapsed since the onset of the symptoms. All the sera 
of probable cases (cf. Table XII) recognize the native 
antigens of SARS-CoV, with the exception of the serum 
032552 of the patient VTT for whom infection with SARS- 
CoV could not be confirmed by RT-PCR performed on 
respiratory samples of days 3, 8 and 12. A panel of control 
sera was used as control (TV sera): they are sera collected 
in France before the SARS epidemic that occurred in 2003. 

TABLE XII

Sera of probable cases of SARS

Serum     Patient   Sample collection day

031724    JYK       7
033168    JYK       38
033597    JYK       74
032632    NTM       17
032634    THA       15
032541    PHV       10
032542    N1H       17
032552    VTT       8
032633    PTU       16
032791    JLB       3
033258    JLB       27
032703    JCM       8
033153    JCM       29



[0629] Solid phases sensitized with the recombinant Ssol 
polypeptide were prepared by adsorption of a solution of 
purified Ssol polypeptide at 2 µg/ml in PBS in the wells of
an ELISA plate, and then the plates are incubated overnight 
at 4° C. and washed with PBS-Tween buffer (PBS, 0.1% 
Tween 20). After saturating the ELISA plates with a solution 
of PBS-10% skimmed milk (weight/volume) and washing in 
PBS-Tween, the sera to be tested (100 µl) are diluted 1/400 in
PBS skimmed milk-Tween buffer (PBS, 3% skimmed milk, 
0.1% Tween) and then added to the wells of the sensitized 
ELISA plate. The plates are incubated for 1 h at 37° C. After 
3 washings with PBS-Tween buffer, the anti-human IgG 
conjugate labeled with peroxidase (ref. NA933V, Amersham)
diluted 1/1000 in PBS-skimmed milk-Tween buffer is
added, and then the plates are incubated for 1 hour at 37° C.
After 6 washings with PBS-Tween buffer, the chromogen
(TMB) and the substrate (H₂O₂) are added and the plates are
incubated for 10 minutes protected from light. The reaction
is stopped by adding a 1 N H₃PO₄ solution, and then the
absorbance is measured at 450 nm with a reference at 620 nm.


[0630] The ELISA tests (FIG. 30) demonstrate that the 
recombinant Ssol polypeptide is specifically recognized by 
the serum antibodies of patients suffering from SARS collected
at the medium or late phase of infection (≥10 days
after the onset of the symptoms) whereas it is not signifi- 
cantly recognized by the serum antibodies of 2 patients (JLB 
and JCM) collected in the early phase of infection (3 to 8 
days after the onset of the symptoms) or by control sera of 
subjects not suffering from SARS. The serum antibodies of 
patients JLB and JCM show a seroconversion between days 
3 and 27 for the first and 8 and 29 for the second after the 
onset of the symptoms, which confirms the specificity of the 
reactivity of these sera toward the Ssol polypeptide. 

[0631] In conclusion, these results demonstrate that the 
recombinant Ssol polypeptide may be used as an antigen for 
the development of an ELISA test for serological diagnosis 
of infection with SARS-CoV. 

EXAMPLE 14 

Vaccine Applications of the Recombinant Soluble S 
Protein 

[0632] The immunogenicity of the recombinant Ssol 
polypeptide was studied in mice. 

[0633] For that, a group of 6 mice was immunized at 3 
weeks' interval with 10 µg of recombinant Ssol polypeptide
adjuvanted with 1 mg of aluminum hydroxide (Alu-gel-S, 
Serva) diluted in PBS. Three successive immunizations 
were performed and the immune sera were collected 3 
weeks after each of the immunizations (IS1, IS2, IS3). As a 
control, a group of mice (mock group) received aluminum 
hydroxide alone according to the same protocol. 

[0634] The immune sera were analyzed per pool for each 
of the 2 groups by indirect ELISA using a lysate of VeroE6 
cells infected with SARS-CoV as antigen and as a control a 
lysate of noninfected VeroE6 cells. The anti-SARS-CoV 
antibody titers are calculated as the reciprocal of the dilution 
producing a specific OD of 0.5 after visualization with an 
anti-mouse IgG(H+L) polyclonal antibody coupled with 
peroxidase (NA931V, Amersham) and TMB supplemented 
with H 2 0 2 (KPL). This analysis (FIG. 31) shows that the 
immunization with the Ssol polypeptide induces in mice, 
from the first immunization, antibodies directed against the 
native form of the SARS-CoV spicule protein present in the 
lysate of infected VeroE6 cells. After 2 then 3 immuniza- 
tions, the anti-S antibody titers become very high. 

[0635] The immune sera were analyzed per pool for each 
of the two groups for their capacity to seroneutralize the 
infectivity of SARS-CoV. 4 points of seroneutralization on 
FRhK-4 cells (100 TCID50 of SARS-CoV) are produced for
each of the 2-fold dilutions tested from 1/20. The seroneutralizing
titer is calculated according to the Reed and Muench
method as the reciprocal of the dilution neutralizing the
infectivity of 2 wells out of 4. This analysis shows that the 



antibodies induced in mice by the Ssol polypeptide are 
neutralizing: the titers observed are very high after 2 and 
then 3 immunizations (greater than 2560 and 5120 respec- 
tively, table XIII). 
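
The 50% endpoint calculation referred to above (4 wells per 2-fold dilution, titer taken as the reciprocal of the dilution protecting 2 wells out of 4) can be sketched with the classical Reed and Muench cumulative method. The sketch below is illustrative only: the function name is hypothetical, the well counts in the usage lines are invented for the example, and the source does not detail how the calculation was implemented.

```python
import math


def reed_muench_titer(reciprocal_dilutions, protected, wells_per_dilution=4):
    """50% neutralization endpoint by the Reed and Muench cumulative method.

    reciprocal_dilutions: e.g. [20, 40, 80, 160], least to most diluted.
    protected: number of wells with no infection at each dilution.
    Raises ValueError when 50% protection is not bracketed by the series.
    """
    infected = [wells_per_dilution - p for p in protected]
    n = len(protected)
    # A well protected at a high dilution would also be protected at lower ones,
    # so protected wells accumulate from the most diluted end ...
    cum_protected = [sum(protected[i:]) for i in range(n)]
    # ... and infected wells accumulate from the least diluted end.
    cum_infected = [sum(infected[:i + 1]) for i in range(n)]
    percent = [100.0 * p / (p + q) for p, q in zip(cum_protected, cum_infected)]

    for i in range(n - 1):
        if percent[i] >= 50.0 > percent[i + 1]:
            distance = (percent[i] - 50.0) / (percent[i] - percent[i + 1])
            step = math.log10(reciprocal_dilutions[i + 1] / reciprocal_dilutions[i])
            return 10 ** (math.log10(reciprocal_dilutions[i]) + distance * step)
    raise ValueError("50% endpoint lies outside the tested dilution range")


# Invented well counts, for illustration only.
titer = reed_muench_titer([20, 40, 80, 160], protected=[4, 3, 1, 0])
print(round(titer, 1))
```

When every tested dilution still protects more than half of the wells, the endpoint lies beyond the last dilution and can only be reported as a lower bound, which is how values such as ">2560" arise.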

TABLE XIII 



Induction of antibodies directed against
SARS-CoV after immunization with the recombinant Ssol
polypeptide. The immune sera were analyzed per pool for
each of the two groups for their capacity to
seroneutralize the infectivity of 100 TCID50 of SARS-
CoV on FRhK-4 cells. 4 points are produced for each of
the 2-fold dilutions tested from 1/20. The
seroneutralizing titer is calculated according to the
Reed and Muench method as the reciprocal of the
dilution neutralizing the infectivity of 2 wells out of 4.

Group   Sera   Neutralizing Ab

Mock    pi     <20
        IS1    <20
        IS2    <20
        IS3    <20
Ssol    IS1    57
        IS2    >2560
        IS3    >5120



[0636] The neutralizing titers observed in mice immu- 
nized with the Ssol polypeptide reach levels far greater than 
the titers observed by Yang et al. in mice (2004, Nature, 
428:561-564) and those observed by Buchholz in the ham- 
ster (2004, PNAS 101:9804-9809) which protect respec- 
tively mice and hamsters from infection with SARS-CoV. It 
is therefore probable that the neutralizing antibodies induced 
in mice after immunization with the Ssol polypeptide protect 
these animals against infection with SARS-CoV. 

EXAMPLE 15 

Optimized Synthetic Gene for the Expression in
Mammalian Cells of the SARS-Associated
Coronavirus (SARS-CoV) Spicule (S) Protein

1) Design of the Synthetic Gene 

[0637] A synthetic gene encoding the SARS-CoV spicule 
protein was designed from the gene of the isolate 031589 
(plasmid pSARS-S, C.N.C.M. No. I-3059) so as to allow
high levels of expression in mammalian cells and in par- 
ticular in cells of human origin. 
[0638] For that: 

[0639] the use of codons of the wild-type gene of the 
isolate 031589 was modified so as to become close to 
the bias observed in humans and to improve the effi- 
ciency of translation of the corresponding mRNA 

[0640] the overall GC content of the gene was increased 
so as to extend the half-life of the corresponding 
mRNA 

[0641] the optionally cryptic motifs capable of interfer- 
ing with an efficient expression of the gene were 
deleted (splice donor and acceptor sites, polyadenyla- 
tion signals, sequences very rich (>80%) or very low 
(<30%) in GC, repeat sequences, sequences involved in 
the formation of secondary RNA structures, TATA boxes)

[0642] a second STOP codon was added to allow efficient
termination of translation.

[0643] In addition, CpG motifs were introduced into the 
gene so as to increase its immunogenicity as DNA vaccine. 
In order to facilitate the manipulation of the synthetic gene, 
two BamHI and XhoI restriction sites were placed on either
side of the open reading frame of the S protein, and the
BamHI, XhoI, NheI, KpnI, BspEI and SalI restriction
sites were avoided in the synthetic gene. 
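
Some of the design constraints listed above (overall GC content, absence of the six named restriction sites inside the gene, absence of stretches with more than 80% or less than 30% GC) lend themselves to a simple automated check. The sketch below is a hypothetical illustration, not the procedure used to design gene 040530: the function names, the 40-nucleotide window and the toy sequence are assumptions, and only the standard recognition sequences of the six enzymes are taken as given.

```python
# Recognition sequences of the enzymes named in paragraph [0643].
# All six are palindromic, so scanning one strand is sufficient.
AVOIDED_SITES = {
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "NheI": "GCTAGC",
    "KpnI": "GGTACC",
    "BspEI": "TCCGGA",
    "SalI": "GTCGAC",
}


def gc_fraction(seq):
    """Fraction of G and C in seq (0.0 to 1.0)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def internal_sites(seq):
    """Map enzyme name -> start positions of its site within seq."""
    seq = seq.upper()
    hits = {}
    for name, site in AVOIDED_SITES.items():
        positions = [i for i in range(len(seq) - len(site) + 1)
                     if seq.startswith(site, i)]
        if positions:
            hits[name] = positions
    return hits


def extreme_gc_windows(seq, window=40, low=0.30, high=0.80):
    """(start, gc) for each window whose GC fraction is < low or > high.

    ASSUMPTION: the 40-nt window is arbitrary; the source gives only the
    30% and 80% thresholds.
    """
    seq = seq.upper()
    flagged = []
    for start in range(len(seq) - window + 1):
        gc = gc_fraction(seq[start:start + window])
        if gc < low or gc > high:
            flagged.append((start, gc))
    return flagged


# Toy open reading frame, invented for the example (not SEQ ID No: 140).
TOY_ORF = "ATGTTCATCTTCCTGCTGTTCCTGACCCTGACCAGCGGC"
print(internal_sites(TOY_ORF))
print(round(gc_fraction(TOY_ORF), 2))
```

Running the same checks on the flanked construct rather than the bare open reading frame would be expected to report the BamHI and XhoI sites deliberately placed on either side of the gene.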

[0644] The sequence of the synthetic gene designed (gene 
040530) is given in SEQ ID No: 140. 

[0645] An alignment of the synthetic gene 040530 with 
the sequence of the wild-type gene of the isolate 031589 of 
SARS-CoV deposited at the C.N.C.M. under the number 
I-3059 (SEQ ID No: 4, plasmid pSARS-S) is presented in
FIG. 32. 

2) Plasmid Constructs 

[0646] The synthetic gene SEQ ID No: 140 was assembled
from synthetic oligonucleotides and cloned between the
KpnI and SacI sites of the plasmid pUC-Kana in order to
give the plasmid 040530pUC-Kana. The nucleotide 
sequence of the insert of the plasmid 040530pUC-Kana was 
verified by automated sequencing (Applied). 

[0647] A KpnI-XhoI fragment containing the synthetic
gene 040530 was excised from the plasmid 040530pUC-Kana
and subcloned between the NheI and XhoI sites of the
expression plasmid pCI (Promega) in order to obtain the
plasmid pCI-Ssynth, deposited at the CNCM on Dec. 1,
2004, under the number I-3333.

[0648] A synthetic gene encoding the soluble form of the 
S protein was then obtained by fusing the synthetic 
sequences encoding the ectodomain of the S protein (amino 
acids 1 to 1193) with those of the tag (FLAG:DYKDDDDK) 
via a linker BspEI encoding the dipeptide SG. Practically, a
DNA fragment encoding the ectodomain of the SARS-CoV
S was amplified by PCR with the aid of the oligonucleotides
5'-ACTAGCTAGC GGATCCACCA TGTTCATCTT CCTG-3'
and 5'-AGTATCCGGA CTTGATGTAC TGCTCGTACT TGC-3'
from the plasmid 040530pUC-Kana, digested
with NheI and BspEI and then inserted between the unique
NheI and BspEI sites of the plasmid pCI-Ssol, to give the
plasmid pCI-Scube, deposited at the CNCM on Dec. 1,
2004, under the number I-3332. The plasmids pCI-Ssol,
pCI-Ssol-CTE, and pCI-Ssol-WPRE (deposited at the
CNCM, on Nov. 22, 2004, under the number I-3324) had
been previously obtained by subcloning the KpnI-XhoI
fragment excised from the plasmid pcDNA-Ssol (see technical
note of DI 2004-106) between the NheI and XhoI sites
of the plasmids pCI, pCI-S-CTE and pCI-S-WPRE respectively.

[0649] The plasmids pCI-Scube and pCI-Ssol encode the 
same recombinant Ssol polypeptide. 

3) Results 

[0650] The capacity of the synthetic gene encoding the S 
protein to efficiently direct the expression of the SARS-CoV 
S in mammalian cells was compared with that of the 
wild-type gene after transient transfection of primate cells 
(VeroE6) and of human cells (293T). 



[0651] In the experiment presented in FIG. 33 and in table 
XIV, monolayers of 5×10⁵ VeroE6 cells or 7×10⁵ 293T cells
in 35 mm Petri dishes were transfected with 2 µg of plasmids
pCI (as control), pCI-S, pCI-S-CTE, pCI-S-WPRE and
pCI-Ssynth and 6 µl of Fugene6 reagent according to the
manufacturer's instructions (Roche). After 48 hours of incubation
at 37° C. and under 5% CO₂, cell extracts were
prepared in loading buffer according to Laemmli, separated 
on 8% SDS polyacrylamide gel and then transferred onto a 
PVDF membrane (BioRad). The detection of this immuno- 
blot (Western blot) was carried out with the aid of an anti-S 
rabbit polyclonal serum (immune serum of the rabbit 
P11135: cf. example 4 above) and of donkey polyclonal
antibodies directed against rabbit IgGs and coupled with 
peroxidase (NA934V, Amersham). The immunoblot was 
quantitatively visualized by luminescence with the aid of the 
ECL+ kit (Amersham) and acquisition on a digital imaging 
device (Fluor S, BioRad). 

[0652] The analysis of the results obtained with the software
QuantityOne v4.2.3 (BioRad) shows that in this experiment,
the plasmid pCI-Ssynth allows the transient expression
of the S protein at high levels in the VeroE6 and 293T cells, 
whereas the plasmid pCI-S does not make it possible to 
induce expression at sufficient levels to be detected. The 
expression levels observed are of the order of twice as high 
as those observed with the plasmid pCI-S-WPRE. 

TABLE XIV 



Use of a synthetic gene for the expression 
of the SARS-CoV S. Cell extracts prepared 48 hours 
after transfection of VeroE6 or 293T cells with the 
plasmids pCI, pCI-S, pCI-S-CTE, pCI-S-WPRE and pCI-Ssynth
were separated on 8% SDS acrylamide gel and
analyzed by Western blotting with the aid of an anti-S 
rabbit polyclonal antibody and an anti-rabbit IgG (H + L) 
polyclonal antibody coupled with peroxidase (NA934V, 

Amersham). The Western blot is visualized by 
luminescence (ECL+, Amersham) and acquisition on a 
digital imaging device (FluorS, BioRad). The expression 

levels of the S protein were measured by quantifying 
the two predominant bands identified on the image (see 
FIG. 33) and are indicated according to an arbitrary 
scale where the value 1 represents the level measured 
after transfection of the plasmid pCI-S-WPRE. 

Plasmid       VeroE6   293T

pCI           0.0      0.0
pCI-S         ≤0.1     ≤0.1
pCI-S-CTE     0.5      ≤0.1
pCI-S-WPRE    1.0      1.0
pCI-Ssynth    1.8      1.9



[0653] In a second instance, the capacity of the synthetic 
gene Scube to efficiently direct the synthesis and the secre- 
tion of the Ssol polypeptide by mammalian cells was com- 
pared with that of the wild-type gene after transient trans- 
fection of hamster cells (BHK-21) and of human cells 
(293T). 

[0654] In the experiment presented in table XV, monolayers
of 6×10⁵ BHK-21 cells and 7×10⁵ 293T cells in 35 mm
Petri dishes were transfected with 2 µg of plasmids pCI (as
control), pCI-Ssol, pCI-Ssol-CTE, pCI-Ssol-WPRE and
pCI-Scube and 6 µl of Fugene6 reagent according to the
manufacturer's instructions (Roche). After 48 hours of incubation
at 37° C. and under 5% CO₂, the cellular supernatants
were collected and quantitatively analyzed for the secretion
of the Ssol polypeptide by a capture ELISA test specific for 
the Ssol polypeptide. 

[0655] Analysis of the results shows that, in this experi- 
ment, the plasmid pCI-Scube allows the expression of the 
Ssol polypeptide at levels 8 times (BHK-21 cells) to 20 
times (293T cells) higher than the plasmid pCI-Ssol. 

[0656] The levels of expression observed are of the order 
of twice (293T cells) to 5 times (BHK-21 cells) as high as 
those observed with the plasmid pCI-Ssol-WPRE. 



[Table XV: caption and values only partly recoverable.
Caption fragments: "... of a synthetic ..."; "... 48 hours
after transfection of BHK or 293T cells with the plasmids
pCI, pCI-Ssol, pCI-Ssol-CTE, pCI-Ssol-WPRE and pCI-Scube
and quantitatively analyzed for the secretion of the Ssol
polypeptide by an ELISA test specific for the Ssol
polypeptide. The transfections ...". Surviving row labels:
pCI-Ssol, pCI-Ssol-CTE, pCI-Ssol-WPRE.]



[0657] In summary, these results show that the expression, 
in mammalian cells, of the synthetic gene 040530 encoding 
SARS-CoV S under the control of RNA polymerase II 
promoter sequences is much more efficient than that of the 
wild-type gene of the 031589 isolate. This expression is 
even more efficient than that directed by the wild-type gene 
in the presence of the WPRE sequences of the woodchuck 



4) Applications 

[0658] The synthetic gene 040530 encoding SARS-CoV S,
or its Scube variant encoding the Ssol polypeptide, can
advantageously replace the wild-type gene in the many
applications that require expression of S at high levels,
in particular in order to:

[0659] improve the efficiency of gene immunization 
with plasmids of the pCI-Ssynth or even pCI-Ssynth- 
CTE or pCI-Ssynth-WPRE type 

[0660] establish novel cell lines expressing higher 
quantities of the S protein or of the Ssol polypeptide 
with the aid of recombinant lentiviral vectors carrying 
the Ssynth gene or the Scube gene respectively 

[0661] improve the immunogenicity of the recombinant 
lentiviral vectors allowing the expression of the S 
protein or of the Ssol polypeptide 

[0662] improve the immunogenicity of live vectors 
allowing the expression of the S protein or of the Ssol 
polypeptide like recombinant vaccinia viruses or 
recombinant measles viruses (see examples 16 and 17 
below) 



EXAMPLE 16
Expression of the SARS-Associated Coronavirus
(SARS-CoV) Spicule (S) Protein with the Aid of
Recombinant Vaccinia Viruses

Vaccine Application 

Application to the Production of a Soluble form of the 
Spicule (S) Protein and Design of a Serological Test for 
SARS 

1) Introduction 

[0663] The aim of this example is to evaluate the capacity 
of recombinant vaccinia viruses (VV) expressing various
SARS-associated coronavirus (SARS-CoV) antigens to con- 
stitute novel vaccine candidates against SARS and a means 
of producing recombinant antigens in mammalian cells. 

[0664] For that, the inventors focused on the SARS-CoV 
spicule (S) protein which makes it possible to induce, after 
gene immunization in animals, antibodies neutralizing the 
infectivity of SARS-CoV, and a soluble and secreted form of 
this protein, the Ssol polypeptide, which is composed of the 
ectodomain (aa 1-1193) of S fused at its C-ter end with a
FLAG tag (DYKDDDDK) via a BspEI linker encoding the SG
dipeptide. This Ssol polypeptide exhibits an antigenicity 
similar to that of the S protein and allows, after injection into 
mice in the form of a purified protein adjuvanted with 
aluminum hydroxide, the induction of high neutralizing 
antibody titers against SARS-CoV. 

[0665] The various forms of the S gene were placed under 
the control of the promoter of the 7.5K gene and then 
introduced into the thymidine kinase (TK) locus of the 
Copenhagen strain of the vaccinia virus by double homolo- 
gous recombination in vivo. In order to improve the immu- 
nogenicity of the recombinant vaccinia viruses, a synthetic 
late promoter was chosen in place of the 7.5K promoter, in 
order to increase the production of S and Ssol during the late 
phases of the viral cycle. 

[0666] After having isolated the recombinant vaccinia 
viruses and verified their capacity to express the SARS-CoV 
S antigen, their capacity to induce in mice an immune 
response against SARS was tested. After having purified the 
Ssol antigen from the supernatant of infected cells, an 
ELISA test for serodiagnosis of SARS was designed, and its 
efficiency was evaluated with the aid of sera from probable 
cases of SARS. 

2) Construction of the Recombinant Viruses 

[0667] Recombinant vaccinia viruses directing the expres- 
sion of the S glycoprotein of the 031589 isolate of SARS- 
CoV and of a soluble and secreted form of this protein, the 
Ssol polypeptide, under the control of the 7.5K promoter 
were obtained. With the aim of increasing the levels of 
expression of S and Ssol, recombinant viruses in which the 
cDNAs for S and for Ssol are placed under the control of a 
late synthetic promoter were also obtained. 

[0668] The plasmid pTG186poly is a transfer plasmid for
the construction of recombinant vaccinia viruses (Kieny,
1986, Biotechnology, 4:790-795). As such, it contains the
VV thymidine kinase gene into which the promoter of the
7.5K gene has been inserted, followed by a multiple cloning
site allowing the insertion of heterologous genes (FIG. 34A).
The promoter of the 7.5K gene in fact contains a tandem of
two promoter sequences that are respectively active during
the early (P_E) and late (P_L) phases of the vaccinia virus
replication cycle. The BamHI-XhoI fragments were excised
from the plasmids pTRIP-S and pcDNA-Ssol respectively
and inserted between the BamHI and SmaI sites of the
plasmid pTG186poly in order to give the plasmids pTG-S
and pTG-Ssol (FIG. 34A). The plasmids pTG-S and pTG-Ssol
were deposited at the CNCM, on Dec. 2, 2004, under
the numbers I-3338 and I-3339, respectively.
[0669] The plasmids pTN480, pTN-S and pTN-Ssol were
obtained from the plasmids pTG186poly, pTG-S and pTG-Ssol
respectively, by replacing the NdeI-PstI fragment
containing the 7.5K promoter with a DNA fragment containing
the synthetic late promoter 480, which was obtained by
hybridization of the oligonucleotides 5'-TATGAGCTTT
TTTTTTTTTT TTTTTTTGGC ATATAAATAG ACTCGGCGCG
CCATCTGCA-3' and 5'-GATGGCGCGC CGAGTCTATT
TATATGCCAA AAAAAAAAAA AAAAAAAAGC TCA-3'
(FIG. 34B). The insert was
sequenced with the aid of a BigDye Terminator v1.1 kit
(Applied Biosystems) and an automated sequencer ABI377.
The sequence of the late synthetic promoter 480 as cloned
into the transfer plasmids of the pTN series is indicated in
FIG. 34C. The plasmids pTN-S and pTN-Ssol were deposited
at the CNCM, on Dec. 2, 2004, under the numbers
I-3340 and I-3341, respectively.
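
The design of the two promoter 480 oligonucleotides can be checked in a few lines. The sketch below is an illustration only and is not part of the procedure described: it takes the two sequences as printed in paragraph [0669] (spaces removed), confirms that the second is complementary to the central portion of the first, and reports the single-stranded ends left after annealing. That these ends are compatible with NdeI-cut and PstI-cut vector ends is an assumption based on the usual cleavage patterns of those enzymes (CA^TATG and CTGCA^G); the function names are hypothetical.

```python
# Sketch: anneal the two promoter-480 oligonucleotides in silico.
# Sequences are copied from paragraph [0669]; helper names are illustrative.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


TOP = ("TATGAGCTTT TTTTTTTTTT TTTTTTTGGC "
       "ATATAAATAG ACTCGGCGCG CCATCTGCA").replace(" ", "")
BOTTOM = ("GATGGCGCGC CGAGTCTATT TATATGCCAA "
          "AAAAAAAAAA AAAAAAAAGC TCA").replace(" ", "")


def anneal(top: str, bottom: str) -> tuple[str, str]:
    """Return the (5' end, 3' end) of `top` left unpaired by `bottom`.

    Raises ValueError if `bottom` is not fully complementary to a
    contiguous stretch of `top`.
    """
    paired = reverse_complement(bottom)
    start = top.find(paired)
    if start < 0:
        raise ValueError("oligonucleotides are not complementary")
    return top[:start], top[start + len(paired):]


five_prime_end, three_prime_end = anneal(TOP, BOTTOM)
print(len(TOP), len(BOTTOM))            # 59 53
print(five_prime_end, three_prime_end)  # TA TGCA
```

Under these assumptions the duplex carries a 2-nt 5' extension (TA) and a 4-nt 3' extension (TGCA), which is what one would expect for a fragment meant to replace an NdeI-PstI fragment.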

[0670] The recombinant vaccinia viruses were obtained by
double homologous recombination in vivo between the TK
cassette of the transfer plasmids of the series pTG and pTN
and the TK gene of the Copenhagen strain of the vaccinia
virus according to a procedure described by Kieny et al.
(1984, Nature, 312:163-166). Briefly, CV-1 cells are
transfected with the aid of DOTAP (Roche) with genomic DNA
of the Copenhagen strain of the vaccinia virus and each of
the transfer plasmids of the pTG and pTN series described
above, and then superinfected with the helper vaccinia virus
VV-ts7 for 24 hours at 33° C. The helper virus is
counter-selected by incubation at 40° C. for 2 days and then the
recombinant viruses (TK- phenotype) are selected by two
cloning cycles under agar medium on 143B tk- cells in the
presence of BuDr (25 µg/ml). The 6 viruses VV-TG, VV-TG-S,
VV-TG-Ssol, VV-TN, VV-TN-S, and VV-TN-Ssol are
respectively obtained with the aid of the transfer plasmids
pTG186poly, pTG-S, pTG-Ssol, pTN480, pTN-S and pTN-Ssol.
The viruses VV-TG and VV-TN do not express any
heterologous gene and were used as TK- controls in the
experiments. The preparations of recombinant viruses were
performed on monolayers of CV-1 or BHK-21 cells and the
titer in plaque forming units (p.f.u.) was determined on CV-1 cells
according to Earl and Moss (1998, Current Protocols in
Molecular Biology, 16.16.1-16.16.13).
3) Characterization of the Recombinant Viruses 
[0671] The expression of the transgenes encoding the S 
protein and the Ssol polypeptide was assessed by Western 
blotting. 

[0672] Monolayers of CV-1 cells were infected at a
multiplicity of 2 with various recombinant vaccinia viruses
VV-TG, VV-TG-S, VV-TG-Ssol, VV-TN, VV-TN-S and
VV-TN-Ssol. After 18 hours of incubation at 37° C. and under
5% CO₂, cellular extracts were prepared in loading buffer
according to Laemmli, separated on 8% SDS polyacrylamide
gel and then transferred onto a PVDF membrane
(BioRad). The detection of this immunoblot (Western blot)
was performed with the aid of an anti-S rabbit polyclonal
serum (immune serum from the rabbit P11135: cf. example
4) and donkey polyclonal antibodies directed against rabbit
IgGs and coupled with peroxidase (NA934V, Amersham).
The bound antibodies were visualized by luminescence with
the aid of the ECL+ kit (Amersham) and autoradiography
films Hyperfilm MP (Amersham).

[0673] As shown in FIG. 35A, the recombinant virus
VV-TN-S directs the expression of the S protein at levels
which are comparable to those which can be observed 8 h
after infection with SARS-CoV but which are much higher
than those which can be observed after infection with
VV-TG-S. In a second experiment (FIG. 35B), the analysis
of variable quantities of cellular extracts shows that the
levels of expression observed after infection with viruses of
the TN series (VV-TN-S and VV-TN-Ssol) are about 10
times as high as those observed with the viruses of the TG
series (VV-TG-S and VV-TG-Ssol, respectively). In addition,
the Ssol polypeptide is secreted into the supernatant of
CV-1 cells infected with the VV-TN-Ssol virus more
efficiently than in the supernatant of cells infected with
VV-TG-Ssol (FIG. 36A). In this experiment, the VV-TN-Sflag virus
was used as a control because it expresses the membrane
form of the S protein fused at its C-ter end with the FLAG
tag. The Sflag protein is not detected in the supernatant of
cells infected with VV-TN-Sflag, demonstrating that the
Ssol polypeptide is indeed actively secreted after infection
with VV-TN-Ssol.

[0674] These results demonstrate that the recombinant
vaccinia viruses are indeed carriers of the transgenes and
allow the expression of the SARS glycoprotein in its
membrane form (S) or in a soluble and secreted form (Ssol). The
vaccinia viruses carrying the synthetic promoter 480 allow
the expression of S and the secretion of Ssol at levels much
higher than the viruses carrying the promoter of the 7.5K
gene.

4) Application to the Production of a Soluble Form of 
SARS-CoV S. Purification of this Recombinant Antigen and 
Diagnostic Applications 

[0675] The BHK-21 line is the cell line which secretes the
highest quantities of Ssol polypeptide after infection with
the VV-TN-Ssol virus among the lines tested (BHK-21,
CV1, 293T and FRhK-4, FIG. 36B); it allows the quantitative
production and purification of the recombinant Ssol
polypeptide. In a typical experiment where the experimental
conditions for infection, production and purification were
optimized, the BHK-21 cells are inoculated in standard
culture medium (pyruvate-free DMEM containing 4.5 g/l of
glucose and supplemented with 5% TPB, 5% FCS, 100 U/ml
of penicillin and 100 µg/ml of streptomycin) in the form of
a subconfluent monolayer (10 million cells for each 100 cm²
in 25 ml of medium). After 24 h of incubation at 37° C.
under 5% CO₂, the cells are infected at an M.O.I. of 0.03 and
the standard medium is replaced with the secretion medium,
in which the quantity of FCS is reduced to 0.5% and the TPB
is eliminated. The culture supernatant is removed after 2.5
days of incubation at 35° C. and under 5% CO₂ and the
vaccinia virus is inactivated by addition of Triton X-100
(0.1%). After filtration on a 0.1 µm polyethersulfone (PES)
membrane, the recombinant Ssol polypeptide is purified by
affinity chromatography on an anti-FLAG matrix with
elution with a solution of FLAG peptide (DYKDDDDK) at 100
µg/ml in TBS (50 mM Tris, pH 7.4, 150 mM NaCl).

[0676] The analysis by 8% SDS acrylamide gel stained
with silver nitrate identified a predominant polypeptide
whose molecular mass is about 180 kD and whose degree of
purity is greater than 90% (FIG. 37). The concentration of
the purified Ssol recombinant polypeptide was determined
by comparison with molecular mass markers and estimated
at 24 ng/µl.

[0677] This purified Ssol polypeptide preparation makes it
possible to produce a calibration series in order to measure,
with the aid of a capture ELISA test, the Ssol concentrations
present in the culture supernatants. According to this test, the
BHK-21 line secretes about 1 µg/ml of Ssol polypeptide
under the production conditions described above. In addition,
the purification scheme presented makes it possible to
purify of the order of 160 µg of Ssol polypeptide per liter of
culture supernatant.
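
As a rough worked example of the production figures in paragraphs [0675] to [0677], the sketch below computes the viral inoculum implied by the stated M.O.I. and the fraction of secreted Ssol recovered by the purification. The input numbers come from the text; the supernatant concentration is read here as about 1 µg/ml, which is an assumption where the unit is unclear, and the function names are illustrative.

```python
# Sketch: back-of-the-envelope production figures for Ssol in BHK-21 cells.
# All function names are hypothetical; inputs are the values quoted in the text.


def inoculum_pfu(cell_count: float, moi: float) -> float:
    """Plaque forming units needed to infect `cell_count` cells at `moi`."""
    return cell_count * moi


def recovery_fraction(purified_ug_per_liter: float,
                      secreted_ug_per_ml: float) -> float:
    """Fraction of the secreted polypeptide that is recovered after purification."""
    secreted_ug_per_liter = secreted_ug_per_ml * 1000.0
    return purified_ug_per_liter / secreted_ug_per_liter


# 10 million cells per 100 cm2, infected at an M.O.I. of 0.03
pfu_per_100_cm2 = inoculum_pfu(10e6, 0.03)

# about 160 ug purified per liter, from a supernatant assumed to hold ~1 ug/ml
recovered = recovery_fraction(160.0, 1.0)

print(f"{pfu_per_100_cm2:.0f} p.f.u. per 100 cm2")  # about 3 x 10^5
print(f"{recovered:.0%} of secreted Ssol recovered")  # 16%
```

On these assumptions the low M.O.I. corresponds to roughly 3×10⁵ p.f.u. per 100 cm² of monolayer, and the affinity step recovers on the order of one sixth of the secreted polypeptide.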

[0678] The ELISA reactivity of the recombinant Ssol 
polypeptide was analyzed toward sera from patients suffer- 
ing from SARS. 

[0679] The sera of probable cases of SARS tested were 
chosen on the basis of the results (positive or negative) of 
analysis of their specific reactivity toward the native anti- 
gens of SARS-CoV by immunofluorescence test on VeroE6 
cells infected with SARS-CoV and/or by indirect ELISA test 
using, as antigen, a lysate of VeroE6 cells infected with 
SARS-CoV. The sera of these patients are identified by a 
serial number of the National Reference Center for Influenza 
Viruses and by the patient's initials and the number of days 
elapsed since the onset of the symptoms. All the sera of 
probable cases (cf. table XVI) recognize the native antigens 
of SARS-CoV with the exception of the serum 032552 of the 
patient VTT, for which infection with SARS-CoV could not 
be confirmed by RT-PCR performed on respiratory samples 
of days 3, 8 and 12. A panel of sera collected in France
before the SARS epidemic which occurred in 2003 was used
as control (TV sera).

TABLE XVI

Sera of probable cases of SARS

Serum     Patient    Sample collection day

033168    JYK        38
033597    JYK        74
032632    NTM        17
032634    THA        15
032542    NTH        17
032633    PTU        16



[0680] Solid phases sensitized with the recombinant Ssol
polypeptide were prepared by adsorption of a solution of
purified Ssol polypeptide at 4 µg/ml in PBS in the wells of
an ELISA plate. The plates are incubated overnight at 4° C.
and then washed with PBS-Tween buffer (PBS, 0.1% Tween
20). After washing with PBS-Tween, the sera to be tested
(100 µl) are diluted 1/100 and 1/400 in PBS-skimmed milk-
Tween buffer (PBS, 3% skimmed milk, 0.1% Tween) and
then added to the wells of the sensitized ELISA plate. The
plates are then incubated for 1 h at 37° C. After 3 washings
with PBS-Tween buffer, the anti-human IgG conjugate
labeled with peroxidase (ref. NA933V, Amersham) diluted
1/4000 in PBS-skimmed milk-Tween buffer is added and then
the plates are incubated for one hour at 37° C. After 6
washings with PBS-Tween buffer, the chromogen (TMB)
and the substrate (H₂O₂) are added and the plates are
incubated for 10 minutes protected from light. The reaction
is stopped by adding a 1M solution of H₃PO₄ and then the
absorbance is measured at 450 nm with a reference at 620 nm.

[0681] The ELISA tests (FIG. 38) demonstrate that the 
recombinant Ssol polypeptide is specifically recognized by 
the serum antibodies of patients suffering from SARS, 
collected at the middle or late phase of infection (≥10 days
after the onset of the symptoms), whereas it is not signifi- 
cantly recognized by the serum antibodies of the control sera 
of subjects not suffering from SARS. 

[0682] In conclusion, these results demonstrate that the 
recombinant Ssol polypeptide can be purified from the 
supernatant of mammalian cells infected with the recombinant
vaccinia virus VV-TN-Ssol and can be used as antigen
for developing an ELISA test for serological diagnosis of 
infection with SARS-CoV. 

5) Vaccine Applications

[0683] The immunogenicity of the recombinant vaccinia
viruses was studied in mice.

[0684] For that, groups of 7 BALB/c mice were immunized
by the i.v. route twice at 4 weeks' interval with 10⁵
p.f.u. of recombinant vaccinia viruses VV-TG, VV-TG-S,
VV-TG-Ssol, VV-TN, VV-TN-S and VV-TN-Ssol and, as a
control, VV-TG-HA which directs the expression of hemagglutinin
of the A/PR/8/34 strain of the influenza virus. The
immune sera were collected 3 weeks after each of the 
immunizations (IS1, IS2). 

[0685] The immune sera were analyzed per pool for each
of the groups by indirect ELISA using a lysate of VeroE6
cells infected with SARS-CoV as antigen and, as control, a
lysate of noninfected VeroE6 cells. The anti-SARS-CoV
antibody titers (TI) are calculated as the reciprocal of the
dilution producing a specific OD of 0.5 after visualization
with an anti-mouse IgG(H+L) polyclonal antibody coupled
with peroxidase (NA931V, Amersham) and TMB supplemented
with H₂O₂ (KPL). This analysis (FIG. 39A) shows
that immunization with the viruses VV-TG-S and VV-TN-S
induces in mice, from the first immunization, antibodies
directed against the native form of the SARS-CoV spicule
protein present in the lysate of infected VeroE6 cells. The
responses induced by the VV-TN-S virus are higher than
those induced by the VV-TG-S virus after the first (TI=740
and TI=270 respectively) and the second (TI=3230 and
TI=600 respectively) immunization. The VV-TN-Ssol virus
induces high anti-SARS-CoV antibody titers after two
immunizations (TI=640), whereas the virus VV-TG-Ssol
induces a response at the detection limit (TI=40).
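
The titer definition in paragraph [0685], the reciprocal of the dilution producing a specific OD of 0.5, can be illustrated with a short calculation. The text does not say how the dilution is interpolated between two measured points; the sketch below assumes log-linear interpolation between the two dilutions that bracket the cutoff, and the OD readings in the example are hypothetical values chosen only for illustration.

```python
# Sketch: endpoint titer as the reciprocal dilution giving a specific OD of 0.5.
# The interpolation scheme and the example readings are assumptions.
import math


def endpoint_titer(reciprocal_dilutions, specific_ods, cutoff=0.5):
    """Interpolate the reciprocal dilution at which the specific OD equals `cutoff`.

    `reciprocal_dilutions` must be in increasing order (e.g. 100, 200, 400)
    with the matching specific ODs (infected minus noninfected lysate).
    Returns None when the curve never crosses the cutoff.
    """
    points = list(zip(reciprocal_dilutions, specific_ods))
    for (d_low, od_high), (d_high, od_low) in zip(points, points[1:]):
        if od_high >= cutoff > od_low:
            fraction = (od_high - cutoff) / (od_high - od_low)
            log_titer = math.log10(d_low) + fraction * (
                math.log10(d_high) - math.log10(d_low))
            return 10 ** log_titer
    return None


# Hypothetical two-fold dilution series of a pooled immune serum
dilutions = [100, 200, 400, 800, 1600]
ods = [1.8, 1.2, 0.7, 0.4, 0.2]
print(round(endpoint_titer(dilutions, ods)))  # 635
```

With this kind of calculation, a serum at the detection limit is simply one whose specific OD is already below 0.5 at the first dilution tested, in which case the function returns `None`.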

[0686] The immune sera were analyzed per pool for each
of the groups for their capacity to seroneutralize the
infectivity of SARS-CoV. Four seroneutralization points on FRhK-4
cells (100 TCID50 of SARS-CoV) are produced for each of
the 2-fold dilutions tested from 1/20. The seroneutralizing
titer is calculated according to the Reed and Muench method
as the reciprocal of the dilution neutralizing the infectivity of
2 wells out of 4. This analysis shows that the antibodies
induced in mice by the vaccinia viruses expressing the S
protein or the Ssol polypeptide are neutralizing and that the
viruses with synthetic promoters are more efficient
immunogens than the viruses carrying the 7.5K promoter: the
highest titers (640) are observed after 2 immunizations with
the virus VV-TN-S (FIG. 39B).
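
The 50% endpoint calculation named in paragraph [0686] can be sketched as follows. This is a generic implementation of the Reed and Muench cumulative method for 4 wells per dilution and two-fold dilutions starting at 1/20, as described in the text; it is not claimed to be the inventors' exact worksheet, and the well counts in the example are hypothetical.

```python
# Sketch: Reed and Muench 50% neutralization endpoint (hypothetical well counts).
import math


def reed_muench_titer(reciprocal_dilutions, protected_wells, wells_per_dilution=4):
    """Return the reciprocal dilution protecting 50% of wells, or None.

    `reciprocal_dilutions` must be increasing (20, 40, 80, ...);
    `protected_wells[i]` is the number of wells neutralized at dilution i.
    """
    unprotected = [wells_per_dilution - p for p in protected_wells]
    n = len(reciprocal_dilutions)
    # A well protected at a high dilution is counted as protected at lower ones.
    cum_protected = [sum(protected_wells[i:]) for i in range(n)]
    # A well infected at a low dilution is counted as infected at higher ones.
    cum_unprotected = [sum(unprotected[:i + 1]) for i in range(n)]
    percent = [100.0 * p / (p + u)
               for p, u in zip(cum_protected, cum_unprotected)]
    for i in range(n - 1):
        if percent[i] >= 50.0 > percent[i + 1]:
            distance = (percent[i] - 50.0) / (percent[i] - percent[i + 1])
            step = math.log10(reciprocal_dilutions[i + 1] / reciprocal_dilutions[i])
            return 10 ** (math.log10(reciprocal_dilutions[i]) + distance * step)
    return None


dilutions = [20, 40, 80, 160, 320]
# Hypothetical readout: number of neutralized wells out of 4 at each dilution
print(round(reed_muench_titer(dilutions, [4, 4, 3, 1, 0])))  # 113
print(round(reed_muench_titer(dilutions, [4, 4, 2, 0, 0])))  # 80
```

The second call shows the simple case mentioned in the text: when exactly 2 wells out of 4 are neutralized at a dilution and none beyond it, the titer is the reciprocal of that dilution.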

[0687] The protective power of the neutralizing antibodies 
induced in mice after immunization with the recombinant 
vaccinia viruses is evaluated with the aid of a challenge 
infection with SARS-CoV. 
6) Other Applications 

[0688] Third generation recombinant vaccinia viruses are 
constructed by substituting the wild-type sequences of the S 
and Ssol genes by synthetic genes optimized for the expres- 
sion in mammalian cells, described above. These recombi- 
nant vaccinia viruses are capable of expressing larger quan- 
tities of S and Ssol antigens and therefore of exhibiting 
increased immunogenicity. 

[0689] The recombinant vaccinia virus VV-TN-Ssol can
be used for the quantitative production and purification of 
the Ssol antigen for diagnostic (serology by ELISA) and 
vaccine (subunit vaccine) applications. 

EXAMPLE 17 
Recombinant Measles Virus Expressing the 
SARS-Associated Coronavirus (SARS-CoV) 
Spicule (S) Protein. Vaccine Applications 
1) Introduction 

[0690] The measles vaccine (MV) induces a lasting pro- 
tective immunity in humans after a single injection (Hille- 
man, 2002, Vaccine, 20: 651-665). The protection conferred 
is very robust and is based on the induction of an antibody 
response and of a CD4 and CD8 cell response. The MV
genome is very stable and no reversion of the vaccine strains 
to virulence has ever been observed. The measles virus 
belongs to the genus Morbillivirus of the Paramyxoviridae 
family; it is an enveloped virus whose genome is a 16 kb 
single-stranded RNA of negative polarity (FIG. 40A) and 
whose exclusively cytoplasmic replication cycle excludes 
any possibility of integration into the genome of the host. 
The measles vaccine is thus one of the most effective and 
one of the safest live vaccines used in the human population. 
Frederic Tangy's team recently developed an expression 
vector on the basis of the Schwarz strain of the measles 
virus, which is the safest attenuated strain and the most 
widely used in humans as vaccine against measles. This 
vaccine strain may be isolated from an infectious molecular 
clone while preserving its immunogenicity in primates and
in mice that are susceptible to the infection. It constitutes, after
insertion of additional transcription units, a vector for the 
expression of heterologous sequences (Combredet, 2003, J. 
Virol. 77: 11546-11554). In addition, a recombinant MV 
Schwarz expressing the envelope glycoprotein of the West 
Nile virus (WNV) induces an effective and lasting antibody 
response which protects mice from a lethal challenge infec- 
tion with WNV (Despres et al., 2004, J. Infect. Dis., in 
press). All these characteristics make the attenuated Schwarz 
strain of the measles virus an extremely promising candidate 
vector for the construction of novel recombinant live vaccines.



[0691] The aim of this example is to evaluate the capacity 
of recombinant measles viruses (MV) expressing various 
SARS-associated coronavirus (SARS-CoV) antigens to con- 
stitute novel candidate vaccines against SARS. 

[0692] The inventors focused on the SARS-CoV spicule 
(S) protein, which makes it possible to induce, after gene 
immunization in animals, antibodies neutralizing the infec- 
tivity of SARS-CoV, and on a soluble and secreted form of 
this protein, the Ssol polypeptide, which is composed of the 
ectodomain (aa 1-1193) of S fused at its C-ter end with a 
FLAG tag (DYKDDDDK) via a BspEI linker encoding the
SG dipeptide. This Ssol polypeptide exhibits a similar 
antigenicity to that of the S protein and allows, after injec- 
tion into mice in the form of a purified protein adjuvanted 
with aluminum hydroxide, the induction of high neutralizing 
antibody titers against SARS-CoV. 

[0693] The various forms of the S gene were introduced in 
the form of an additional transcription unit between the P 
(phosphoprotein) and M (matrix) genes into the cDNA of the 
Schwarz strain of MV previously described (Combredet,
2003, J. Virol. 77: 11546-11554; EP application No. 
02291551.6 of Jun. 20, 2002, and EP application No. 
02291550.8 of Jun. 20, 2002). After having isolated the 
recombinant viruses MVSchw2-SARS-S and MVSchw2- 
SARS-Ssol and checked their capacity to express the SARS- 
CoV S antigen, their capacity to induce a protective immune 
response against SARS in mice and then in monkeys was 
tested. 

2) Construction of the Recombinant Viruses 

[0694] The plasmid pTM-MVSchw-ATU2 (FIG. 40B) 
contains an infectious cDNA corresponding to the antige- 
nome of the Schwarz vaccine strain of the measles virus 
(MV) into which an additional transcription unit (ATU) has 
been introduced between the P (phosphoprotein) and M 
(matrix) genes (Combredet, 2003, Journal of Virology, 77: 
11546-11554). Recombinant genomes MVSchw2-SARS-S 
and MVSchw2-SARS-Ssol of the measles virus were constructed
by inserting the ORFs of the S protein and of the Ssol
polypeptide into the additional transcription unit of the
MVSchw-ATU2 vector. 

[0695] For that, a DNA fragment containing the SARS-CoV
S cDNA was amplified by PCR with the aid of the
oligonucleotides 5'-ATACGTACGA CCATGTTTAT
TTTCTTATTA TTTCTTACTC TCACT-3' and
5'-ATAGCGCGCT CATTATGTGT AATGTAATTT
GACACCCTTG-3' using the plasmid pcDNA-S as template and then
inserted into the plasmid pCR®2.1-TOPO (Invitrogen) in
order to obtain the plasmid pTOPO-S-MV. The two
oligonucleotides used contain restriction sites BsiWI and BssHII,
so as to allow subsequent insertion into the measles vector,
and were designed so as to generate a sequence of 3774 nt
including the codons for initiation and termination, so as to
observe the rule of 6 which stipulates that the length of the
genome of a measles virus must be divisible by 6 (Calain &
Roux, 1993, J. Virol., 67: 4822-4830; Schneider et al., 1997,
Virology, 227: 314-322). The insert was sequenced with the
aid of a BigDye Terminator v1.1 kit (Applied Biosystems)
and an automated sequencer ABI377.
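
The two design constraints stated in paragraph [0695], the presence of the cloning sites in the primers and the rule of 6, lend themselves to a quick check. The sketch below is illustrative only: the primer sequences are those printed in the paragraph (spaces removed), the recognition sequences CGTACG (BsiWI) and GCGCGC (BssHII) are the standard ones for these enzymes, and the helper names are hypothetical.

```python
# Sketch: check the cloning sites in the S primers and the rule of 6.
# Helper names are illustrative; sequences are copied from paragraph [0695].

FORWARD = ("ATACGTACGA CCATGTTTAT TTTCTTATTA "
           "TTTCTTACTC TCACT").replace(" ", "")
REVERSE = ("ATAGCGCGCT CATTATGTGT AATGTAATTT "
           "GACACCCTTG").replace(" ", "")

# Both recognition sequences are palindromic, so strand orientation is irrelevant.
SITES = {"BsiWI": "CGTACG", "BssHII": "GCGCGC"}


def sites_in(primer: str) -> list[str]:
    """Names of the restriction sites whose recognition sequence occurs in `primer`."""
    return [name for name, site in SITES.items() if site in primer]


def obeys_rule_of_six(length_nt: int) -> bool:
    """True when an insert length keeps the genome length a multiple of 6."""
    return length_nt % 6 == 0


print(sites_in(FORWARD))        # ['BsiWI']
print(sites_in(REVERSE))        # ['BssHII']
print(obeys_rule_of_six(3774))  # True (3774 = 6 x 629)
```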

[0696] To express a soluble and secreted form of SARS-CoV
S, a plasmid containing the cDNA of the Ssol polypeptide
corresponding to the ectodomain (aa 1-1193) of SARS-CoV
S fused at its C-ter end with the sequence of a FLAG
tag (DYKDDDDK) via a BspEI linker encoding the SG
dipeptide was then obtained. For that, a DNA fragment was
amplified with the aid of the oligonucleotides
5'-CCATTTCAAC AATTTGGCCG-3' and
5'-ATAGGATCCG CGCGCTCATT ATTTATCGTC GTCATCTTTA
TAATC-3' from the plasmid pcDNA-Ssol and then inserted
into the plasmid pTOPO-S-MV between the SalI and
BamHI sites in order to obtain the plasmid pTOPO-S-MV-SF.
The sequence generated is 3618 nt long between the
BsiWI and BssHII sites and observes the rule of 6. The
insert was sequenced as indicated above.
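
The reverse primer of paragraph [0696] can be decoded to confirm that it carries the FLAG tag and the termination codons. The sketch below, offered as an illustration rather than as part of the described method, reverse-complements the primer and translates it with the standard genetic code. The length accounting at the end is one possible reading that reproduces the stated 3618 nt; it assumes that the count runs from the ACC triplet preceding the ATG in the forward S primer through two consecutive stop codons, which the text does not state explicitly.

```python
# Sketch: decode the Ssol reverse primer and account for the 3618-nt insert.
# Function names and the length breakdown are assumptions for illustration.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons of `seq` in frame 1 ('*' marks a stop)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


REVERSE_PRIMER = ("ATAGGATCCG CGCGCTCATT ATTTATCGTC "
                  "GTCATCTTTA TAATC").replace(" ", "")
coding_strand = reverse_complement(REVERSE_PRIMER)
print(translate(coding_strand[:30]))  # DYKDDDDK**


def ssol_insert_length(ectodomain_residues: int = 1193) -> int:
    """One accounting of the insert length (assumed breakdown, see lead-in)."""
    leader = 3                       # ACC preceding the ATG
    ectodomain = 3 * ectodomain_residues
    linker = 3 * 2                   # SG dipeptide
    flag = 3 * 8                     # DYKDDDDK
    stops = 3 * 2                    # two stop codons
    return leader + ectodomain + linker + flag + stops


print(ssol_insert_length(), ssol_insert_length() % 6)  # 3618 0
```

The primer also contains the GGATCC (BamHI) and GCGCGC (BssHII) sequences, consistent with its use for cloning between the SalI and BamHI sites and for later excision with BssHII.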
[0697] The BsiWI-BssHII fragments containing the
cDNAs for the S protein and the Ssol polypeptide were then
excised by digestion of the plasmids pTOPO-S-MV and
pTOPO-S-MV-SF and then subcloned between the corresponding
sites of the plasmid pTM-MVSchw-ATU2 in order
to give the plasmids pTM-MVSchw2-SARS-S and
pTM-MVSchw2-SARS-Ssol (FIG. 40B). These two plasmids
were deposited at the C.N.C.M. on Dec. 1, 2004, under the
numbers I-3326 and I-3327, respectively.
[0698] The recombinant measles viruses corresponding to
the plasmids pTM-MVSchw2-SARS-S and
pTM-MVSchw2-SARS-Ssol were obtained by reverse genetics
according to the system based on the use of a helper cell line,
described by Radecke et al. (1995, Embo J., 14: 5773-5784)
and modified by Parks et al. (1999, J. Virol., 73: 3560-3566).
Briefly, the helper cells 293-3-46 are transfected according
to the calcium phosphate method with 5 µg of the plasmids
pTM-MVSchw2-SARS-S or pTM-MVSchw2-SARS-Ssol
and 0.02 µg of the plasmid pEMC-La directing the
expression of the MV L polymerase (gift from M. A. Billeter).
After incubating overnight at 37° C., a heat shock is
produced for 2 hours at 43° C. and the transfected cells are
transferred onto a monolayer of Vero cells. For each of the
two plasmids, syncytia appeared after 2 to 3 days of
coculture and were transferred successively onto monolayers of
Vero cells at 70% confluence in 35 mm Petri dishes and then
in 25 and 75 cm² flasks. When the syncytia have reached
80-90% confluence, the cells are recovered with the aid of
a scraper and then frozen and thawed once. After low-speed
centrifugation, the supernatant containing the virus is stored
in aliquots at -80° C. The titers of the recombinant viruses
MVSchw2-SARS-S and MVSchw2-SARS-Ssol were determined
by limiting dilution on Vero cells and the titer as dose
infecting 50% of the wells (TCID₅₀) calculated according to
the Kärber method.

3) Characterization of the Recombinant Viruses 
[0699] The expression of the transgenes encoding the S 
protein and the Ssol polypeptide was assessed by Western 
blotting and immunofluorescence. 

[0700] Monolayers of Vero cells in T-25 flasks were
infected at a multiplicity of 0.05 by various passages of the
two viruses MVSchw2-SARS-S and MVSchw2-SARS-Ssol
and the wild-type virus MVSchw as a control. When the
syncytia had reached 80 to 90% confluence, cytoplasmic
extracts were prepared in an extraction buffer (150 mM
NaCl, 50 mM Tris-HCl, pH 7.2, 1% Triton X-100, 0.1%
SDS, 1% DOC) and then diluted in loading buffer according
to Laemmli, separated on 8% SDS polyacrylamide gel and
transferred onto a PVDF membrane (BioRad). The detection
of this immunoblot (Western blot) was carried out with the
aid of an anti-S rabbit polyclonal serum (immune serum of
the rabbit P11135: cf. example 4 above) and donkey
polyclonal antibodies directed against rabbit IgGs and coupled
with peroxidase (NA934V, Amersham). The bound antibodies
were visualized by luminescence with the aid of the
ECL+ kit (Amersham) and Hyperfilm MP autoradiography
films (Amersham).

[0701] Vero cells in monolayers on glass slides were
infected with the two viruses MVSchw2-SARS-S and
MVSchw2-SARS-Ssol and the wild-type virus MVSchw as
a control at multiplicities of infection of 0.05. When the
syncytia had reached 90 to 100% (MVSchw2-SARS-Ssol
virus) or 30 to 40% (MVSchw2-SARS-S, MVSchw)
confluence, the cells were fixed in a 4% PBS-PFA solution,
permeabilized with a PBS solution containing 0.2% Triton
and then labeled with rabbit polyclonal antibodies
hyperimmunized with purified and inactivated SARS-CoV virions
and with an anti-rabbit IgG(H+L) goat antibody conjugate
coupled with FITC (Jackson).

[0702] As shown in FIGS. 41 and 42, the recombinant 
viruses MVSchw2-SARS-S and MVSchw2-SARS-Ssol 
direct the expression of the S protein and the Ssol polypep- 
tide respectively at levels comparable to those which can be 
observed 8 h after infection with SARS-CoV. The expres- 
sion of these polypeptides is stable after 3 passages of the 
recombinant viruses in cell culture. These results demon- 
strate that the recombinant measles viruses are indeed car- 
riers of the transgenes and allow the expression of the SARS 
glycoprotein in its membrane form (S) or in a soluble form 
(Ssol). The Ssol polypeptide is expected to be secreted by 
cells infected with the MVSchw2-SARS-Ssol virus as is the 
case when this same polypeptide is expressed in mammalian 
cells after transient transfection of the corresponding 
sequences (cf. example 11 above). 

4) Applications 

[0703] Having shown that the viruses MVSchw2-SARS-S 
and MVSchw2-SARS-Ssol allow the expression of the 
SARS-CoV S, their capacity to induce a protective immune 
response against SARS-CoV in CD46+/- IFN-α/βR-/- mice,
which are susceptible to infection by MV, is evaluated. The
antibody response of the immunized mice is evaluated by 
ELISA test against the native antigens of SARS-CoV and for 
their capacity to neutralize the infectivity of SARS-CoV in 
vitro, using the methodologies described above. The pro- 
tective power of the response will be evaluated by measur- 
ing the reduction in the pulmonary viral load 2 days after a 
nonlethal challenge infection with SARS-CoV. 

[0704] Second generation recombinant measles viruses 
are constructed by substituting the wild-type sequences of 
the S and Ssol genes by synthetic genes optimized for
expression in mammalian cells, described in example 15 
above. These recombinant measles viruses are capable of 
expressing larger quantities of the S and Ssol antigens and 
therefore of exhibiting increased immunogenicity. 

[0705] Alternatively, the wild-type or synthetic genes 
encoding the S protein or the Ssol polypeptide may be 
inserted into the measles vector MVSchw-ATU3 in the form 
of an additional transcription unit located between the H and 
L genes, and then the recombinant viruses produced and 
characterized in a similar manner. This insertion is capable
of generating recombinant viruses possessing different
characteristics (multiplication of the virus, level of expression of
the transgene) and possibly an improved immunogenicity
compared with those obtained after insertion of the
transgenes between the P and M genes.

[0706] The recombinant measles virus MVSchw2-SARS- 
Ssol may be used for the quantitative production and the 
purification of the Ssol antigen for diagnostic and vaccine 
applications. 

EXAMPLE 18 
Other Applications Linked to the S Protein 
[0707] a) The lentiviral vectors allowing the expression of
S or Ssol (or even of fragments of S) can constitute a
recombinant vaccine against SARS-CoV, to be used in
human or veterinary prophylaxis. In order to demonstrate
the feasibility of such a vaccine, the immunogenicity of the
recombinant lentiviral vectors TRIP-SD/SA-S-WPRE and
TRIP-SD/SA-Ssol-WPRE is studied in mice.
[0708] b) Monoclonal antibodies are produced with the aid 
of the recombinant Ssol polypeptide. According to the 
results presented in example 14 above, these antibodies, or at
least the majority of them, will recognize the native form of
the SARS-CoV S protein and will be suitable for diagnostic
and/or prophylactic applications.

[0709] c) A serological test for SARS is developed with 
the Ssol polypeptide used as antigen and the double epitope 
methodology. 
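
For readers working with the sequence listing that follows, each row consists of groups of ten nucleotides and a cumulative position count at the right. The following Python sketch shows one way such rows might be checked for transcription damage; the function name `check_row` and the restriction to the letters a, c, g and t are assumptions made for illustration and form no part of the listing itself.

```python
def check_row(line, previous_end):
    """Return a list of problems found in one non-empty sequence-listing row.

    `line` is a row such as 'atattaggtt ... tagatctgtt 60' and
    `previous_end` is the position count of the preceding row
    (0 for the first row of a sequence).
    """
    *groups, position = line.split()
    problems = []

    # Every residue is assumed to be one of the four lowercase nucleotide letters.
    bases = "".join(groups)
    unexpected = sorted(set(bases) - set("acgt"))
    if unexpected:
        problems.append("unexpected characters: " + "".join(unexpected))

    # The right-hand count should equal the previous count plus this row's length.
    if not position.isdigit():
        problems.append("position field is not a number: " + position)
    elif int(position) != previous_end + len(bases):
        problems.append("position " + position + " does not match row length")

    return problems


clean = "atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt 60"
print(check_row(clean, 0))
```

A row in which a letter has been misread, or in which the position count has been split or altered, is reported rather than silently accepted.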



SEQUENCE LISTING 

<160> NUMBER OF SEQ ID NOS : 158 

<210> SEQ ID NO 1 

<211> LENGTH: 29746 

<212> TYPE : DNA 

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 1 

atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt 60

ctctaaacga actttaaaat ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac 120

gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct 180

tctgcagact gcttacggtt tcgtccgtgt tgcagtcgat catcagcata cctaggtttc 240

gtccgggtgt gaccgaaagg taagatggag agccttgttc ttggtgtcaa cgagaaaaca 300

cacgtccaac tcagtttgcc tgtccttcag gttagagacg tgctagtgcg tggcttcggg 360

gactctgtgg aagaggccct atcggaggca cgtgaacacc tcaaaaatgg cacttgtggt 420

ctagtagagc tggaaaaagg cgtactgccc cagcttgaac agccctatgt gttcattaaa 480

cgttctgatg ccttaagcac caatcacggc cacaaggtcg ttgagctggt tgcagaaatg 540

gacggcattc agtacggtcg tagcggtata acactgggag tactcgtgcc acatgtgggc 600

gaaaccccaa ttgcataccg caatgttctt cttcgtaaga acggtaataa gggagccggt 660

ggtcatagct atggcatcga tctaaagtct tatgacttag gtgacgagct tggcactgat 720

cccattgaag attatgaaca aaactggaac actaagcatg gcagtggtgc actccgtgaa 780

ctcactcgtg agctcaatgg aggtgcagtc actcgctatg tcgacaacaa tttctgtggc 840 

ccagatgggt accctcttga ttgcatcaaa gattttctcg cacgcgcggg caagtcaatg 900 

gaccatgagc atgaaattgc ctggttcact gagcgctctg ataagagcta cgagcaccag 1020

acacccttcg aaattaagag tgccaagaaa tttgacactt tcaaagggga atgcccaaag 1080 

tttgtgtttc ctcttaactc aaaagtcaaa gtcattcaac cacgtgttga aaagaaaaag 1140 

actgagggtt taatggggcg tatacgctct gtgtaccctg ttgcatctcc acaggagtgt 1200 

aacaatatgc acttgtctac cttgatgaaa tgtaatcatt gcgatgaagt ttcatggcag 1260 

acgtgcgact ttctgaaagc cacttgtgaa cattgtggca ctgaaaattt agttattgaa 1320 
ggacctacta catgtgggta cctacctact aatgctgtag tgaaaatgcc atgtcctgcc 1380

tgtcaagacc cagagattgg acctgagcat agtgttgcag attatcacaa ccactcaaac 1440 

attgaaactc gactccgcaa gggaggtagg actagatgtt ttggaggctg tgtgtttgcc 1500 

tatgttggct gctataataa gcgtgcctac tgggttcctc gtgctagtgc tgatattggc 1560 

tcaggccata ctggcattac tggtgacaat gtggagacct tgaatgagga tctccttgag 1620 

atactgagtc gtgaacgtgt taacattaac attgttggcg attttcattt gaatgaagag 1680 

gttgccatca ttttggcatc tttctctgct tctacaagtg cctttattga cactataaag 1740 

agtcttgatt acaagtcttt caaaaccatt gttgagtcct gcggtaacta taaagttacc 1800

aagggaaagc ccgtaaaagg tgcttggaac attggacaac agagatcagt tttaacacca 1860 

ctgtgtggtt ttccctcaca ggctgctggt gttatcagat caatttttgc gcgcacactt 1920

gatgcagcaa accactcaat tcctgatttg caaagagcag ctgtcaccat acttgatggt 1980 

atttctgaac agtcattacg tcttgtcgac gccatggttt atacttcaga cctgctcacc 2040

aacagtgtca ttattatggc atatgtaact ggtggtcttg tacaacagac ttctcagtgg 2100 

ttgtctaatc ttttgggcac tactgttgaa aaactcaggc ctatctttga atggattgag 2160

gcgaaactta gtgcaggagt tgaatttctc aaggatgctt gggagattct caaatttctc 2220

attacaggtg tttttgacat cgtcaagggt caaatacagg ttgcttcaga taacatcaag 2280

gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac tcgaaatgtg cattgatcaa 2340

gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag gtgaagtctt catcgctcaa 2400

agcaagggac tttaccgtca gtgtatacgt ggcaaggagc agctgcaact actcatgcct 2460 

cttaaggcac caaaagaagt aacctttctt gaaggtgatt cacatgacac agtacttacc 2520 

tctgaggagg ttgttctcaa gaacggtgaa ctogaagcao tcgagacgcc ogttgatago 2580 

ttcaoaaatg gagctatcgt tggcacacca gtctgtgtaa atggcctcat gctcttagag 2640 

attaaggaca aagaacaata ctgcgcattg tctcctggtt tactggctac aaacaatgtc 2700 

tttcgcttaa aagggggtgc accaattaaa ggtgtaacot ttggagaaga tactgtttgg 2760 

gaagttcaag gttacaagaa tgtgagaato aoatttgagc ttgatgaacg tgttgacaaa 2820 

gtgcttaatg aaaagtgctc tgtctacact gttgaatccg gtaccgaagt tactgagttt 2880 

gcatgtgttg tagcagaggc tgttgtgaag actttacaac cagtttctga tctccttacc 2940 

aacatgggta ttgatcttga tgagtggagt gtagctacat tctacttatt tgatgatgct 3000 

gaggacgatg cagagtgtga ggaagaagaa attgatgaaa octgtgaaca tgagtacggt 3120 

acagaggatg attatcaagg tctcoctctg gaatttggtg cctcagctga aacagttcga 3180 

gttgaggaag aagaagagga agactggctg gatgatacta ctgagcaatc agagattgag 3240 

ccagaaccag aacctacacc tgaagaacca gttaatcagt ttactggtta tttaaaactt 3300 

aotgacaatg ttgooattaa atgtgttgac atogttaagg aggcacaaag tgctaatcct 3360 

atggtgattg taaatgctgc taacatacac ctgaaacatg gtggtggtgt agoaggtgca 3420 

ctcaacaagg caaccaatgg tgccatgcaa aaggagagtg atgattacat taagctaaat 3480 

ggccctctta cagtaggagg gtcttgtttg ctttctggac ataatcttgc taagaagtgt 3540 

ctgcatgttg ttggaoctaa cctaaatgoa ggtgaggaca tccagcttct taaggcagca 3600 
tatgaaaatt tcaattcaca ggacatctta cttgcaccat tgttgtcagc aggcatattt

ggtgctaaac cacttcagtc tttacaagtg tgcgtgcaga cggttcgtac acaggtttat 

attgcagtca atgacaaagc tctttatgag caggttgtca tggattatot tgataacctg 

gaggagaaat ctgtcgtaca gaagcctgtc gatgtgaagc caaaaattaa ggcctgcatt 

gatgaggtta ccacaacact ggaagaaact aagtttctta ccaataagtt actcttgttt 

gotgatatca atggtaagct ttaccatgat totcagaaca tgcttagagg tgaagatatg 

totttoottg agaaggatgc accttacatg gtaggtgatg ttatcactag tggtgatato 

acttgtgttg taataccctc caaaaaggct ggtggcacta ctgagatgct ctcaagagct 

ttgaagaaag tgcoagttga tgagtatata accaogtacc ctggacaagg atgtgctggt 

tatacacttg aggaagctaa gactgctctt aagaaatgca aatctgcatt ttatgtacta 

ccttcagaag oacotaatgo taaggaagag attctaggaa ctgtatcctg gaatttgaga 

gaaatgcttg ctcatgctga agagacaaga aaattaatgc ctatatgcat ggatgttaga 

gcoataatgg caaccatcca acgtaagtat aaaggaatta aaattcaaga gggcatcgtt 

gactatggtg tccgattctt cttttatact agtaaagagc ctgtagcttc tattattacg 

aagctgaact otctaaatga gcogcttgtc acaatgocaa ttggttatgt gacacatggt 

tttaatcttg aagaggctgc gcgctgtatg cgttctctta aagctcctgo cgtagtgtca 

gtatcatcac cagatgctgt tactacatat aatggatacc tcacttcgtc atcaaagaca 

tctgaggagc actttgtaga aacagtttot ttggctggct cttacagaga ttggtcctat 

toaggacagc gtacagagtt aggtgttgaa tttcttaagc gtggtgacaa aattgtgtac 

cacactctgg agagccccgt cgagtttcat cttgacggtg aggttctttc acttgacaaa 

ctaaagagtc tcttatccct gcgggaggtt aagactataa aagtgttcac aactgtggac 

aacactaatc tccacacaca gcttgtggat atgtctatga catatggaca gcagtttggt 

ocaaoatact tggatggtgc tgatgttaca aaaattaaac ctcatgtaaa tcatgagggt 

aagactttct ttgtactaoo tagtgatgac acactacgta gtgaagcttt cgagtactac 

catactcttg atgagagttt tcttggtagg tacatgtctg ctttaaacca cacaaagaaa 5160 

tggaaatttc ctcaagttgg tggtttaact tcaattaaat gggctgataa caattgttat 5220 

ttgtctagtg ttttattagc acttcaacag cttgaagtca aattcaatgc accagcactt 5280 

caagaggctt attatagagc ccgtgctggt gatgotgcta acttttgtgc actcatactc 5340 

gcttacagta ataaaactgt tggcgagctt ggtgatgtca gagaaactat gacccatctt 5400 

ctacagcatg ctaatttgga atctgcaaag cgagttctta atgtggtgtg taaacattgt 5460 

tatgataatc ttaagacagg tgtttcoatt ccatgtgtgt gtggtcgtga tgctacacaa 5580 

tatotagtac aacaagagtc ttcttttgtt atgatgtctg caccacctgc tgagtataaa 5640 

ttacagoaag gtacattctt atgtgcgaat gagtacaotg gtaactatca gtgtggtcat 5700 

tacactcata taactgctaa ggagaccctc tatcgtattg acggagctca ccttacaaag 5760 

atgtcagagt acaaaggacc agtgactgat gttttctaca aggaaacatc ttacactaca 5820 

accatcaagc ctgtgtcgta taaactcgat ggagttactt acacagagat tgaaccaaaa 5880 
ttggatgggt attataaaaa ggataatgct tactatacag agcagcctat agaccttgta 5940

ccaactcaao cattaccaaa tgcgagtttt gataatttca aactoacatg ttctaacaca 6000 

aaatttgctg atgatttaaa tcaaatgaoa ggcttcacaa agccagottc acgagagcta 6060 

tctgtcacat tcttcccaga ottgaatggc gatgtagtgg ctattgacta tagacactat 6120 

caggctacaa ccaagacaac gttcaaacca aacacttggt gtttacgttg tctttggagt 6240 
acaaagccag tagatacttc aaattcattt gaagttotgg cagtagaaga cacacaagga 6300 

accatacaga aggaagtcat agagtgtgac gtgaaaacta ccgaagttgt aggcaatgtc 6420 

atacttaaac catcagatga aggtgttaaa gtaacacaag agttaggtca tgaggatctt 6480 

atggctgctt atgtggaaaa cacaagcatt accattaaga aacctaatga gctttcacta 6540 

gccttaggtt taaaaacaat tgccactcat ggtattgctg oaattaatag tgttocttgg 6600 

agtaaaattt tggcttatgt caaaccattc ttaggacaag cagcaattac aacatcaaat 6660 

tgcgctaaga gattagcaca acgtgtgttt aacaattata tgccttatgt gtttacatta 6720 

ttgttccaat tgtgtacttt tactaaaagt accaattcta gaattagagc ttcactacct 6780

aoaaotattg otaaaaatag tgttaagagt gttgotaaat tatgtttgga tgcoggcatt 6840 

aattatgtga agtcaccoaa attttotaaa ttgttcaoaa togctatgtg gotattgttg 6900 

ttaagtattt gcttaggttc tctaatctgt gtaactgctg cttttggtgt actcttatct 6960 
aattttggtg ctcottctta ttgtaatggc gttagagaat tgtatcttaa ttcgtctaac 
gttactacta tggatttctg tgaaggttct tttcottgca gcatttgttt aagtggatta 

ctagacttga caattttagg tctggccgct gagtgggttt tggcatatat gttgttcaca 
aaattctttt atttattagg tctttcagct ataatgcagg tgttctttgg ctattttgct 
agtcatttca tcagcaattc ttggctcatg tggtttatca ttagtattgt acaaatggca 
cccgtttctg caatggttag gatgtacatc ttctttgctt ctttctacta catatggaag 
agctatgttc atatcatgga tggttgcacc tcttcgactt gcatgatgtg ctataagcgc 

gtctatgcaa atggaggccg tggcttctgc aagactcaca attggaattg tctcaattgt 

gaoacatttt gcactggtag tacattcatt agtgatgaag ttgctcgtga tttgtcactc 

cagtttaaaa gaooaatcaa ccctactgac cagtcatcgt atattgttga tagtgttgct 

gtgaaaaatg gcgcgcttca cctctacttt gacaaggctg gtcaaaagac ctatgagaga 

ctgcctatta atgtcatagt ttttgatggc aagtocaaat gcgacgagtc tgcttctaag 

tctgottotg tgtactacag tcagctgatg tgocaaccta ttctgttgct tgaccaagct 

ottgtatcag acgttggaga tagtactgaa gtttocgtta agatgtttga tgcttatgtc 

gacacctttt cagcaacttt tagtgttcct atggaaaaac ttaaggcact tgttgctaoa 8040 

gctcacagcg agttagcaaa gggtgtagct ttagatggtg tcctttctac attcgtgtca 8100 

gctgcccgac aaggtgttgt tgataccgat gttgacacaa aggatgttat tgaatgtctc 8160 
aaactttcac atcactctga cttagaagtg acaggtgaca gttgtaacaa tttcatgctc 8220

acctataata aggttgaaaa catgacgccc agagatcttg gcgcatgtat tgactgtaat 8280 

aaagactaca tgtctttatc tgaacagctg cgtaaacaaa ttcgtagtgc tgccaagaag 8400 

aacaacatac cttttagact aacttgtgct acaactagac aggttgtcaa tgtcataact 8460 

actaaaatct cactcaaggg tggtaagatt gttagtactt gttttaaact tatgcttaag 8520 

gocaoattat tgtgcgttct tgctgcattg gtttgttata togttatgoc agtacataoa 8580 

ttgtoaatcc atgatggtta cacaaatgaa atcattggtt acaaagccat tcaggatggt 8640 

gtcactcgtg acatcatttc tactgatgat tgttttgcaa ataaacatgc tggttttgac 8700 

gcatggttta gccagcgtgg tggttcatac aaaaatgaca aaagctgccc tgtagtagct 8760 

gctatcatta caagagagat tggtttcata gtgcctggct taccgggtac tgtgctgaga 8820 

gcaatcaatg gtgaottctt gcattttota cctcgtgttt ttagtgctgt tggcaaoatt 8880 

tgctacacac cttccaaact cattgagtat agtgattttg ctacctctgc ttgcgttctt 8940

gctgctgagt gtacaatttt taaggatgot atgggoaaac ctgtgcoata ttgttatgac 9000 

actaatttgc tagagggtto tatttottat agtgagottc gtcoagaoac tcgttatgtg 9060 

ottatggatg gttccatcat acagtttcct aaoacttacc tggagggttc tgttagagta 9120 

gtaacaaott ttgatgotga gtactgtaga catggtacat gcgaaaggtc agaagtaggt 9180 

atttgcctat ctaccagtgg tagatgggtt cttaataatg agcattacag agctctatca 9240

ggagttttct gtggtgttga tgcgatgaat otcatagcta aoatctttac tcotottgtg 9300 

caacctgtgg gtgotttaga tgtgtctgct toagtagtgg ctggtggtat tattgccata 9360 

ttggtgaott gtgctgcota ctactttatg aaattcagac gtgtttttgg tgagtacaac 9420 

catgttgttg ctgctaatgc acttttgttt ttgatgtctt tcactatact ctgtctggta 9480 

coagcttaca gctttctgcc gggagtctac tcagtctttt acttgtactt gacattctat 9540 

ttcaccaatg atgtttcatt cttggctcac ottcaatggt ttgccatgtt ttctcctatt 9600 

gtgccttttt ggataaoagc aatctatgta ttctgtattt ctctgaagca ctgcoattgg 9660 

ttctttaaca actatcttag gaaaagagtc atgtttaatg gagttacatt tagtaccttc 9720 

gaggaggctg ctttgtgtac ctttttgctc aacaaggaaa tgtacctaaa attgcgtagc 9780 

gagacactgt tgccacttac acagtataac aggtatcttg ctctatataa caagtacaag 9840 

tatttoagtg gagccttaga tactaccago tatcgtgaag oagcttgctg coacttagca 9900 

aaggctctaa atgactttag caactcaggt gctgatgttc tctaccaacc aocacagaca 9960 

tcaatcactt ctgctgttct gcagagtggt tttaggaaaa tggcattccc gtcaggcaaa 10020 

gttgaagggt gcatggtaca agtaacctgt ggaactacaa ctcttaatgg attgtggttg 10080 

gatgacacag tatactgtcc aagacatgtc atttgcacag cagaagacat gcttaatcct 10140 

aactatgaag atctgotoat tcgcaaatcc aaccatagct ttcttgttca ggctggcaat 10200 

gttcaacttc gtgttattgg ccattotatg caaaattgto tgcttaggct taaagttgat 10260 

acttctaacc ctaagacacc caagtataaa tttgtccgta tccaacctgg tcaaacattt 10320 

tcagttctag catgctacaa tggttcacca tctggtgttt atcagtgtgc catgagacct 10380 

aatcatacca ttaaaggttc tttoottaat ggatcatgtg gtagtgttgg ttttaacatt 10440 
gattatgatt gcgtgtcttt ctgctatatg catcatatgg agcttccaac aggagtacac 10500

gctggtactg acttagaagg taaattctat ggtccatttg ttgacagaca aactgcacag 10560 

gotgcaggta cagacacaac cataacatta aatgttttgg oatggotgta tgctgctgtt 10620 

atcaatggtg ataggtggtt tcttaataga ttcaccacta ctttgaatga ctttaacctt 10680 

gtggcaatga agtacaacta tgaacctttg acacaagatc atgttgacat attgggacct 10740 

ctttctgctc aaacaggaat tgccgtctta gatatgtgtg ctgctttgaa agagctgctg 10800 

cagaatggta tgaatggtcg tactatcctt ggtagcacta ttttagaaga tgagtttaca 10860 

ccatttgatg ttgttagaca atgctctggt gttaccttcc aaggtaagtt caagaaaatt 10920 

gttaagggca ctcatcattg gatgctttta actttcttga catcactatt gattcfctgtt 10980 

oaaagtacac agtggtcact gtttttcttt gtttacgaga atgctttctt gccatttact 11040 

ottggtatta tggcaattgo tgoatgtgot atgctgottg ttaagcataa gcaogcattc 11100 

ttgtgcttgt ttctgttacc ttctcttgca acagttgctt actttaatat ggtctacatg 11160 

cctgctagct gggtgatgcg tatcatgaca tggcttgaat tggctgacac tagcttgtct 11220 

ggttataggc ttaaggattg tgttatgtat gcttcagctt tagttttgct tattctcatg 11280 

acagctcgca ctgtttatga tgatgctgct agaogtgttt ggacactgat gaatgtcatt 11340 

acaottgttt acaaagtcta ctatggtaat gotttagatc aagotattto oatgtgggcc 11400 

ttagttattt ctgtaacctc taactattct ggtgtcgtta ogaotatoat gtttttagot 11460 

agagctatag tgtttgtgtg tgttgagtat taoocattgt tatttattac tggcaacaoc 11520 

ttacagtgta tcatgottgt ttattgtttc ttaggotatt gttgctgctg ctaotttggc 11580 

cttttctgtt taotcaaccg ttacttcagg ottactcttg gtgtttatga ctacttggto 11640 

tctaoaoaag aatttaggta tatgaaotoo caggggcttt tgoctcotaa gagtagtatt 11700 

gatgctttca agcttaacat taagttgttg ggtattggag gtaaaccatg tatcaaggtt 11760 

gctactgtac agtctaaaat gtctgacgta aagtgcacat ctgtggtact gctctcggtt 11820 

cttcaacaac ttagagtaga gtcatcttct aaattgtggg cacaatgtgt acaactccac 11880 

aatgatattc ttcttgcaaa agaoaoaaot gaagotttcg agaagatggt ttctcttttg 11940 

tctgttttgc tatccatgca gggtgctgta gacattaata ggttgtgcga ggaaatgctc 12000 

gataaccgtg ctactcttca ggctattgct tcagaattta gttctttacc atcatatgcc 12060 

gcttatgcca ctgcccagga ggcctatgag caggctgtag ctaatggtga ttctgaagtc 12120 

gttctcaaaa agttaaagaa atctttgaat gtggctaaat otgagtttga ocgtgatgct 12180 

gccatgcaac gcaagttgga aaagatggca gatcaggcta tgacccaaat gtacaaacag 12240 

gcaagatctg aggacaagag ggcaaaagta actagtgcta tgcaaacaat gctcttcact 12300 

atgcttagga agcttgataa tgatgcactt aacaacatta tcaacaatgc gcgtgatggt 12360 

gattatggta cctacaagaa cacttgtgat ggtaacacct ttacatatgc atctgcactc 12480 

tgggaaatoc agcaagttgt tgatgoggat agcaagattg ttcaacttag tgaaattaac 12540 

atggacaatt caccaaattt ggcttggcct cttattgtta cagctctaag agccaactca 12600 

gctgttaaac tacagaataa tgaaotgagt ccagtagcac tacgacagat gtcctgtgcg 12660 

gctggtacca cacaaacago ttgtactgat gacaatgoac ttgcctacta taacaattcg 12720 
aagggaggta ggtttgtgct ggcattacta tcagaccacc aagatctcaa atgggctaga 12780

ttccctaaga gtgatggtac aggtacaatt tacacagaac tggaaocacc ttgtaggttt 12840 

gttacagaca caccaaaagg gcctaaagtg aaatacttgt aottcatcaa aggcttaaac 12900 

aacctaaata gaggtatggt gctgggcagt ttagctgcta cagtacgtct tcaggctgga 12960 

aatgctacag aagtacctgc caattcaaot gtgctttcct tctgtgcttt tgcagtagac 13020 

octgctaaag catataagga ttacctagca agtggaggac aaccaatcao oaactgtgtg 13080 

aagatgttgt gtacacacac tggtacagga caggcaatta ctgtaacacc agaagctaac 13140

atggaccaag agtoctttgg tggtgcttoa tgttgtctgt attgtagatg ccaoattgac 13200 

catccaaatc ctaaaggatt ctgtgacttg aaaggtaagt acgtccaaat acctaccact 13260 

tgtgctaatg acccagtggg ttttacactt agaaacacag tctgtaccgt ctgcggaatg 13320 

tggaaaggtt atggctgtag ttgtgaccaa ctcogcgaac ccttgatgca gtctgcggat 13380 

gcatcaacgt ttttaaacgg gtttgcggtg taagtgcagc ccgtcttaca ocgtgoggca 13440 

caggcactag taotgatgtc gtctacaggg cttttgatat ttacaacgaa aaagttgctg 13500 

gttttgcaaa gttcctaaaa actaattgct gtcgottcca ggagaaggat gaggaaggca 13560 

atttattaga ctcttacttt gtagttaaga ggoataotat gtctaactac caacatgaag 13620 

agaotattta taaottggtt aaagattgto cagcggttgo tgtccatgao tttttcaagt 13680 

ttagagtaga tggtgacatg gtacoaoata tatcaogtoa gogtctaact aaatacaoaa 13740 

tggctgattt agtctatgct ctaogtcatt ttgatgaggg taattgtgat acattaaaag 13800 

aaatactcgt cacatacaat tgctgtgatg atgattattt caataagaag gattggtatg 13860 

aottogtaga gaatoctgac atottacgcg tatatgctaa cttaggtgag cgtgtacgcc 13920 

aatoattatt aaagactgta caattctgog atgctatgcg tgatgcaggc attgtaggcg 13980 

tactgacatt agataatcag gatcttaatg ggaactggta cgatttcggt gatttcgtac 14040 

aagtagcaco aggctgcgga gttcctattg tggattcata ttactcattg ctgatgccca 14100 

tcctoacttt gactagggca ttggctgctg agtcccatat ggatgctgat ctcgcaaaac 14160 

cacttattaa gtgggatttg ctgaaatatg attttacgga agagagactt tgtctcttcg 14220 

accgttattt taaatattgg gaccagacat accatcccaa ttgtattaac tgtttggatg 14280 

ataggtgtat ccttcattgt gcaaacttta atgtgttatt ttctactgtg tttccaccta 14340 

caagttttgg accactagta agaaaaatat ttgtagatgg tgttcctttt gttgtttcaa 14400 

ctggataoca ttttcgtgag ttaggagtcg tacataatca ggatgtaaac ttacatagct 14460 

cgcgtctcag tttcaaggaa cttttagtgt atgctgctga tccagctatg catgcagctt 14520 

ctggcaattt attgctagat aaacgcacta catgcttttc agtagctgca ctaacaaaca 14580 

atgttgcttt tcaaactgtc aaacccggta attttaataa agacttttat gactttgctg 14640 

tgtctaaagg tttctttaag gaaggaagtt ctgttgaact aaaacacttc ttctttgctc 14700 

aggatggcaa cgctgctatc agtgattatg actattatcg ttataatotg ccaacaatgt 14760 

gtgatatcag acaactoota ttogtagttg aagttgttga taaatacttt gattgttacg 14820 

atggtggctg tattaatgcc aaccaagtaa tcgttaacaa tctggataaa tcagctggtt 14880 

tcccatttaa taaatggggt aaggctagac tttattatga ctcaatgagt tatgaggatc 14940 

aagatgcact tttcgcgtat aotaagogta atgtcatoco tactataact caaatgaatc 15000 
ttaagtatgc cattagtgca aagaatagag ctcgcaccgt agctggtgtc tctatctgta 15060

gtactatgac aaatagacag tttcatcaga aattattgaa gtcaatagcc gccaotagag 15120 

gagctactgt ggtaattgga acaagcaagt tttacggtgg otggcataat atgttaaaaa 15180 

ctgtttaoag tgatgtagaa actccacacc ttatgggttg ggattatcca aaatgtgaca 15240 

gagccatgcc taacatgctt aggataatgg cctctcttgt tcttgctcgc aaacataaca 15300 

cttgctgtaa cttatcacac cgtttctaca ggttagctaa cgagtgtgcg caagtattaa 15360 

gtgagatggt catgtgtggc ggctcactat atgttaaacc aggtggaaca tcatccggtg 15420 

atgctacaac tgcttatgct aatagtgtct ttaacatttg tcaagctgtt acagccaatg 15480 

taaatgcact tctttcaact gatggtaata agatagotga caagtatgtc cgcaatctac 15540 

aacacaggct ctatgagtgt ctctatagaa atagggatgt tgatcatgaa ttcgtggatg 15600 

agttttacgc ttaootgogt aaacatttct ccatgatgat totttctgat gatgccgttg 15660 

tgtgctataa cagtaactat gcggctcaag gtttagtago tagcattaag aactttaagg 15720 

cagttcttta ttatcaaaat aatgtgttca tgtctgaggc aaaatgttgg actgagactg 15780 

accttactaa aggacctcac gaattttgct cacagcatac aatgctagtt aaacaaggag 15840

atgattacgt gtacctgcct tacccagatc catcaagaat attaggcgca ggctgttttg 15900 

togatgatat tgtoaaaaca gatggtacao ttatgattga aaggttcgtg tcaotggota 15960 

ttgatgctta cccacttaca aaacatccta atoaggagta tgctgatgtc tttcacttgt 16020 

atttacaata cattagaaag ttacatgatg agottactgg ccacatgttg gacatgtatt 16080 

ccgtaatgct aactaatgat aacacctcac ggtactggga acctgagttt tatgaggcta 16140 

tgtaoaoaoo acataoagto ttgcaggctg taggtgcttg tgtattgtgc aattoaoaga 16200 

cttcacttcg ttgcggtgco tgtattagga gaccattcot atgttgcaag tgctgotatg 16260 

accatgtcat ttcaacatca caoaaattag tgttgtctgt taatccctat gtttgcaatg 16320 

ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctattatt 16380 

gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag gtttttggtt 16440 

tatacaaaaa cacatgtgta ggcagtgaoa atgtcactga cttcaatgcg atagcaacat 16500 

gtgattggac taatgctggc gattacatac ttgccaacac ttgtactgag agactcaago 16560 

ttttcgcagc agaaacgctc aaagccactg aggaaacatt taagctgtca tatggtattg 16620 

ccactgtacg cgaagtactc tctgacagag aattgcatct ttcatgggag gttggaaaac 16680 

ctagaocacc attgaaaaga aactatgtct ttactggtta ccgtgtaact aaaaatagta 16740 

aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct gttgtgtaca 16800 

gaggtactac gacatacaag ttgaatgttg gtgattactt tgtgttgaca tctcacactg 16860 

taatgccact tagtgcacct actctagtgc cacaagagca ctatgtgaga attactggct 16920 

tgtacccaac actcaacatc tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg 16980 

tcggcatgca aaagtactct acactccaag gaccacctgg tactggtaag agtcattttg 17040 

ccatcggact tgctctctat tacccatctg ctogcatagt gtataoggca tgctctcatg 17100 

cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat aaatgtagta 17160 

gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa attcaaagtg aattcaacac 17220 

tagaaoagta tgttttctgc aotgtaaatg cattgccaga aacaactgct gacattgtag 17280 
tctttgatga aatctctatg gctactaatt atgacttgag tgttgtcaat gctagacttc 17340

gtgcaaaaca ctacgtotat attggcgatc ctgotcaatt accagccccc cgcacattgc 17400 

tgaotaaagg caoaotagaa ccagaatatt ttaattoagt gtgcagactt atgaaaacaa 17460 

tgagtgcttt agtttatgac aataagctaa aagcacacaa ggataagtca gctcaatgct 17580 

tcaaaatgtt ctacaaaggt gttattacac atgatgtttc atctgcaatc aacagacctc 17640 

aaataggcgt tgtaagagaa tttcttacac gcaatcctgc ttggagaaaa gctgttttta 17700 

ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa 17820 

cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca aaaattggca 17880 

ttttgtgcat aatgtctgat agagatcttt atgacaaact gcaatttaca agtctagaaa 17940 

taccacgtcg caatgtggct acattaoaag cagaaaatgt aactggactt tttaaggact 18000 

gtagtaagat cattactggt cttcatccta cacaggcacc tacacacctc agcgttgata 18060 

taaagttcaa gactgaagga ttatgtgttg acataccagg cataccaaag gacatgaoct 18120 

accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttaoocta 18180 

atatgtttat oaooogcgaa gaagotattc gtoaogttcg tgcgtggatt ggctttgatg 18240 

tagagggctg tcatgcaaot agagatgotg tgggtaotaa cctacctoto oagctaggat 18300 

tttctacagg tgttaactta gtagctgtac cgactggtta tgttgacact gaaaataaca 18360 

cagaattcac cagagttaat gcaaaacctc caccaggtga ocagtttaaa catcttatac 18420 

cactcatgta taaaggottg ccctggaatg tagtgcgtat taagatagta caaatgctca 18480 

gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg 18540

agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt tgtctgtgtg 18600 

acaaacgtgc aacttgottt tctacttcat oagatactta tgcctgctgg aatcattctg 18660 

tgggttttga ctatgtctat aacccattta tgattgatgt tcagcagtgg ggctttacgg 18720 

gtaaccttca gagtaaccat gaccaacatt gccaggtaca tggaaatgca catgtggota 18780 

gttgtgatgc tatcatgact agatgtttag cagtccatga gtgctttgtt aagcgcgttg 18840 

attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa 18900 

aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca gttcttcatg 18960 

aoattggaaa tccaaaggct atcaagtgtg tgcctcaggc tgaagtagaa tggaagttct 19020 

acgatgctca gccatgtagt gaoaaagctt acaaaataga ggaactcttc tattcttatg 19080 

ctacacatca cgataaattc actgatggtg tttgtttgtt ttggaattgt aacgttgatc 19140 

gttacccagc caatgoaatt gtgtgtaggt ttgacacaag agtcttgtca aacttgaact 19200 

taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt 19260 

tcgataaaag tgcatttact aatttaaagc aattgccttt cttttactat tctgatagtc 19320 

cttgtgagtc tcatggcaaa caagtagtgt cggatattga ttatgttcca ctcaaatctg 19380 

ctacgtgtat tacacgatgc aatttaggtg gtgctgtttg cagacaccat gcaaatgagt 19440 

accgacagta cttggatgca tataatatga tgatttctgc tggatttagc ctatggattt 19500 

acaaacaatt tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa 19560 
atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg 19620

tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg gagatctttg 19680 

aaccagtgcc agagattaag atactcaata atttgggtgt tgatatcgct gctaatactg 19800 

taatctggga ctacaaaaga gaagccccag cacatgtatc tacaataggt gtctgcacaa 19860 

tgactgacat tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg 19920

atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa 19980 

cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct agcgtcaatg 20040

gagtcacatt aattggagaa tcagtaaaaa cacagtttaa ctactttaag aaagtagacg 20100 

gcattattca aoagttgcct gaaacctact ttaotcagag cagagactta gaggatttta 20160 

agcccagato acaaatggaa actgacttto tcgagctcgo tatggatgaa ttcatacagc 20220 

gatataagct cgagggctat gccttogaac acatcgttta tggagattto agtcatggac 20280 

aacttggcgg tcttcattta atgataggct tagccaagcg ctcacaagat tcaccactta 20340 

aattagagga ttttatccct atggacagca oagtgaaaaa ttacttcata acagatgcgc 20400 

aaacaggttc atcaaaatgt gtgtgttctg tgattgatct tttacttgat gactttgtcg 20460 

agataataaa gtcacaagat ttgtcagtga tttoaaaagt ggtcaaggtt aoaattgact 20520 

atgctgaaat ttoattoatg otttggtgta aggatggaoa tgttgaaacc ttotacooaa 20580 

aactacaagc aagtcaagcg tggcaaccag gtgttgcgat gcctaacttg tacaagatgc 20640 

aaagaatgct tcttgaaaag tgtgaccttc agaattatgg tgaaaatgct gttataccaa 20700 

aaggaataat gatgaatgtc gcaaagtata otoaaotgtg tcaatactta aatacactta 20760 

ctttagctgt accctacaac atgagagtta ttoaotttgg tgctggctct gataaaggag 20820 

ttgcaccagg tacagctgtg ctcagacaat ggttgccaac tggcacacta cttgtcgatt 20880 

cagatcttaa tgaottcgtc tccgacgcag attctacttt aattggagac tgtgcaacag 20940 

tacatacggc taataaatgg gaccttatta ttagcgatat gtatgaccct aggaccaaac 21000 

atgtgacaaa agagaatgao totaaagaag ggtttttcac ttatctgtgt ggatttataa 21060 

agcaaaaact agccctgggt ggttctatag ctgtaaagat aacagagcat tcttggaatg 21120 

ctgaccttta caagcttatg ggccatttct catggtggac agcttttgtt acaaatgtaa 21180 

atgcatcatc atcggaagca tttttaattg gggotaacta tcttggcaag ccgaaggaac 21240 

aaattgatgg ctataccatg catgctaact acattttotg gaggaacaca aatcctatco 21300 

agttgtotto ctattcactc tttgacatga gcaaatttcc tcttaaatta agaggaactg 21360 

ctgtaatgtc tcttaaggag aatcaaatca atgatatgat ttattctctt ctggaaaaag 21420 

gtaggcttat cattagagaa aacaacagag ttgtggtttc aagtgatatt cttgttaaca 21480 

actaaacgaa catgtttatt ttcttattat ttcttactct cactagtggt agtgaccttg 21540 

tgaggggggt ttactatoot gatgaaattt ttagatoaga cactctttat ttaactcagg 21660 

atttatttct tccattttat tctaatgtta cagggtttca tactattaat catacgtttg 21720 

gcaaccctgt catacctttt aaggatggta tttattttgc tgccacagag aaatcaaatg 21780 

ttgtccgtgg ttgggttttt ggttctacca tgaacaacaa gtcaoagtcg gtgattatta 21840 
ttaacaattc tactaatgtt gttatacgag catgtaactt tgaattgtgt gacaaccctt 21900

tctttgctgt ttctaaaccc atgggtacac agacacatac tatgatattc gataatgcat 21960 

ttaattgcac tttcgagtac atatctgatg ccttttcgct tgatgtttca gaaaagtcag 22020 

gtaattttaa acacttacga gagtttgtgt ttaaaaataa agatgggttt ctctatgttt 22080 

ataagggcta tcaacctata gatgtagttc gtgatctacc ttctggtttt aacactttga 22140 

aacctatttt taagttgcct cttggtatta acattacaaa ttttagagcc attcttacag 22200 

cottttcaoc tgctcaagac atttggggca cgtoagctgc agcctatttt gttggctatt 22260 

taaagccaac tacatttatg ctcaagtatg atgaaaatgg tacaatcaoa gatgctgttg 22320 

attgttctca aaatccactt gctgaactca aatgctctgt taagagcttt gagattgaca 22380

aaggaattta ccagacctct aatttcaggg ttgttccctc aggagatgtt gtgagattcc 22440 

ctaatattac aaacttgtgt ccttttggag aggtttttaa tgotactaaa ttcccttctg 22500 

tctatgcatg ggagagaaaa aaaatttcta attgtgttgc tgattactct gtgctctaca 22560 

actcaacatt tttttcaacc tttaagtgct atggcgtttc tgccactaag ttgaatgatc 22620 

tttgcttotc caatgtctat gcagattctt ttgtagtcaa gggagatgat gtaagacaaa 22680 

tagcgccagg acaaactggt gttattgctg attataatta taaattgcoa gatgatttca 22740 

tgggttgtgt octtgcttgg aatactagga acattgatgc tacttcaaot ggtaattata 22800 

attataaata taggtatctt agacatggca agcttaggco ctttgagaga gaoatatcta 22860 

atgtgcottt otcccctgat ggcaaacctt gcaooccacc tgctcttaat tgttattggc 22920 

cattaaatga ttatggtttt tacaccacta ctggcattgg ctaccaaoct tacagagttg 22980 

tagtaottto ttttgaactt ttaaatgoao cggccacggt ttgtggacca aaattatooa 23040 

ctgaccttat taagaaccag tgtgtcaatt ttaattttaa tggactcact ggtactggtg 23100 

tgttaactcc ttcttcaaag agatttcaac catttcaaca atttggccgt gatgtttctg 23160 

atttcactga ttccgttcga gatcctaaaa catctgaaat attagacatt tcaccttgct 23220 

cttttggggg tgtaagtgta attacacctg gaacaaatgc ttcatctgaa gttgotgtto 23280 

tatatcaaga tgttaactgc actgatgttt otacagcaat tcatgcagat caactcacac 23340 

cagcttggcg catatattct actggaaaca atgtattcca gactcaagca ggctgtctta 23400 

taggagctga gcatgtcgao acttcttatg agtgcgacat tcctattgga gctggcattt 23460 

gtgctagtta ccatacagtt tctttattac gtagtactag ccaaaaatct attgtggctt 23520 

ataotatgto tttaggtgct gatagttcaa ttgcttacto taataacacc attgctatac 23580 

ctaotaaott ttcaattagc attaotacag aagtaatgoc tgtttctatg gctaaaacct 23640 

ccgtagattg taatatgtac atctgcggag attctactga atgtgctaat ttgcttctcc 23700 

atcgcaacac aogtgaagtg ttcgctcaag tcaaacaaat gtaoaaaacc ccaactttga 23820 

aatattttgg tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga 23880 

ggtcttttat tgaggaottg ctctttaata aggtgacact cgctgatgct ggcttcatga 23940 

agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt gcgcagaagt 24000 

tcaatggact tacagtgttg ccacctctgc tcactgatga tatgattgct gcctacactg 24060 

ctgctctagt tagtggtact gccactgctg gatggacatt tggtgctggc gctgctcttc 24120 
aaataccttt tgctatgcaa atggcatata ggttcaatgg cattggagtt acccaaaatg 24180

ttctctatga gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc 24240 

aagaatoaot tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga 24300 

atgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt gcaatttcaa 24360 

gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga ggcggaggta caaattgaca 24420 

ggttaattac aggcagactt caaagccttc aaacctatgt aacacaacaa ctaatcaggg 24480 

ctgctgaaat cagggcttct gctaatcttg ctgctactaa aatgtctgag tgtgttcttg 24540

gacaatcaaa aagagttgac ttttgtggaa agggotaooa ccttatgtcc ttcccacaag 24600 

cagocccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact 24660 

tcaccacagc gccagcaatt tgtcatgaag gcaaagcata cttccctcgt gaaggtgttt 24720 

ttgtgtttaa tggcacttct tggtttatta cacagaggaa cttcttttct ccacaaataa 24780 

ttaotaoaga caatacattt gtctoaggaa attgtgatgt cgttattggc atcattaaca 24840 

acacagttta tgatcctctg caacctgagc ttgactcatt caaagaagag ctggacaagt 24900 

acttcaaaaa tcatacatca ocagatgttg atcttggcga catttoaggc attaacgctt 24960 

ctgtcgtcaa cattoaaaaa gaaattgacc gootcaatga ggtcgctaaa aatttaaatg 25020 

aatoaotoat tgaccttcaa gaattgggaa aatatgagca atatattaaa tggccttggt 25080 

atgtttggot oggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt 25140 

gttgcatgao tagttgttgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca 25200 

agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa 25260 

cgaacttatg gatttgttta tgagattttt taotcttgga tcaattactg cacagccagt 25320 

aaaaattgac aatgottotc ctgcaagtao tgttcatgot aoagcaacga taccgctaca 25380 

agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag 25440 

cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca 25500 

gttcatttgc aatttactgc tgctatttgt tacoatctat tcacatcttt tgcttgtcgc 25560 



tgoaggtatg gaggogcaat ttttgtacct ctatgccttg atatattttc tacaatgcat 25620 

caacgcatgt agaattatta tgagatgttg gctttgttgg aagtgcaaat ccaagaaccc 25680 

attactttat gatgccaact actttgtttg ctggcacaoa cataactatg actactgtat 25740 

accatataac agtgtcacag atacaattgt cgttactgaa ggtgacggca tttcaacacc 25800

aaaactcaaa gaagactacc aaattggtgg ttattctgag gataggcact caggtgttaa 25860 

agactatgtc gttgtacatg gctatttcac cgaagtttac taccagcttg agtctacaca 25920 

aattactaca gacactggta ttgaaaatgc tacattcttc atctttaaoa agcttgttaa 25980 

aatggatcca atttatgatg agccgacgac gactactago gtgcctttgt aagcacaaga 26100 

aagtgagtac gaacttatgt actcattcgt ttcggaagaa acaggtacgt taatagttaa 26160 

tagcgtactt ctttttcttg ctttcgtggt attcttgcta gtcacactag ccatccttac 26220

tgcgcttcga ttgtgtgcgt actgctgcaa tattgttaac gtgagtttag taaaaccaac 26280

ggtttacgtc tactcgcgtg ttaaaaatct gaactcttct gaaggagttc ctgatcttct 26340

ggtctaaacg aactaactat tattattatt ctgtttggaa ctttaacatt gcttatcatg 26400

gcagacaacg gtactattac cgttgaggag cttaaacaac tcctggaaca atggaaccta 26460

gtaataggtt tcctattcct agcctggatt atgttactac aatttgccta ttctaatcgg 26520

aacaggtttt tgtacataat aaagcttgtt ttcctctggc tcttgtggcc agtaacactt 26580 

gcttgttttg tgcttgctgc tgtctacaga attaattggg tgactggcgg gattgcgatt 26640 

gcaatggctt gtattgtagg cttgatgtgg cttagctact tcgttgcttc cttcaggctg 26700

tttgctcgta cccgctcaat gtggtcattc aacccagaaa caaacattct tctcaatgtg 26760

cctctccggg ggacaattgt gaccagaccg ctcatggaaa gtgaacttgt cattggtgct 26820

gtgatcattc gtggtcactt gcgaatggcc ggacactccc tagggcgctg tgacattaag 26880

gacctgccaa aagagatcac tgtggctaca tcacgaacgc tttcttatta caaattagga 26940

gcgtcgcagc gtgtaggcac tgattcaggt tttgctgcat acaaccgcta ccgtattgga 27000

aactataaat taaatacaga ccacgccggt agcaacgaca atattgcttt gctagtacag 27060

taagtgacaa cagatgtttc atcttgttga cttccaggtt acaatagcag agatattgat 27120

tatcattatg aggactttca ggattgctat ttggaatctt gacgttataa taagttcaat 27180 

agtgagacaa ttatttaagc ctctaactaa gaagaattat tcggagttag atgatgaaga 27240 

acctatggag ttagattatc cataaaacga acatgaaaat tattctcttc ctgacattga 27300

ttgtatttac atcttgcgag ctatatcact atcaggagtg tgttagaggt acgactgtac 27360

tactaaaaga accttgccca tcaggaacat acgagggcaa ttcaccattt caccctcttg 27420

ctgacaataa atttgcacta acttgcacta gcacacactt tgcttttgct tgtgctgacg 27480

gtactcgaca tacctatcag ctgcgtgcaa gatcagtttc accaaaactt ttcatcagac 27540

aagaggaggt tcaacaagag ctctactcgc cactttttct cattgttgct gctctagtat 27600 

ttttaatact ttgcttcacc attaagagaa agacagaatg aatgagctca ctttaattga 27660

cttctatttg tgctttttag cctttctgct attccttgtt ttaataatgc ttattatatt 27720

ttggttttca ctcgaaatcc aggatctaga agaaccttgt accaaagtct aaacgaacat 27780

gaaacttctc attgttttga cttgtatttc tctatgcagt tgcatatgca ctgtagtaca 27840

gcgctgtgca tctaataaac ctcatgtgct tgaagatcct tgtaaggtac aacactaggg 27900 

gtaatactta tagcactgct tggctttgtg ctctaggaaa ggttttacct tttcatagat 27960 

ggcacactat ggttcaaaca tgcacaccta atgttactat caactgtcaa gatccagctg 28020 

gtggtgcgct tatagctagg tgttggtacc ttcatgaagg tcaccaaact gctgcattta 28080

gagacgtact tgttgtttta aataaacgaa caaattaaaa tgtctgataa tggaccccaa 28140

tcaaaccaac gtagtgcccc ccgcattaca tttggtggac ccacagattc aactgacaat 28200

aaccagaatg gaggacgcaa tggggcaagg ccaaaacagc gccgacccca aggtttaccc 28260

aataatactg cgtcttggtt cacagctctc actcagcatg gcaaggagga acttagattc 28320 

cctcgaggcc agggcgttcc aatcaacacc aatagtggtc cagatgacca aattggctac 28380 

taccgaagag ctacccgacg agttcgtggt ggtgacggca aaatgaaaga gctcagcccc 28440

agatggtact tctattacct aggaactggc ccagaagctt cacttcccta cggcgctaac 28500

aaagaaggca tcgtatgggt tgcaactgag ggagccttga atacacccaa agaccacatt 28560

ggcacccgca atcctaataa caatgctgcc accgtgctac aacttcctca aggaacaaca 28620

ttgccaaaag gcttctacgc agagggaagc agaggcggca gtcaagcctc ttctcgctcc 28680

tcatcacgta gtcgcggtaa ttcaagaaat tcaactcctg gcagcagtag gggaaattct 28740

cctgctcgaa tggctagcgg aggtggtgaa actgccctcg cgctattgct gctagacaga 28800

ttgaaccagc ttgagagcaa agtttctggt aaaggccaac aacaacaagg ccaaactgtc 28860

cagtacaacg tcactcaagc atttgggaga cgtggtccag aacaaaccca aggaaatttc 28980 

ggggaccaag acctaatcag acaaggaact gattacaaac attggccgca aattgcacaa 29040

tttgctccaa gtgcctctgc attctttgga atgtcacgca ttggcatgga agtcacacct 29100

tcgggaacat ggctgactta tcatggagcc attaaattgg atgacaaaga tccacaattc 29160

aaagacaacg tcatactgct gaacaagcac attgacgcat acaaaacatt cccaccaaca 29220

gagcctaaaa aggacaaaaa gaaaaagact gatgaagctc agcctttgcc gcagagacaa 29280

aagaagcagc ccactgtgac tcttcttcct gcggctgaca tggatgattt ctccagacaa 29340

cttcaaaatt ccatgagtgg agcttctgct gattcaactc aggcataaac actcatgatg 29400 

accacacaag gcagatgggc tatgtaaacg ttttcgcaat tccgtttacg atacatagtc 29460 

tactcttgtg cagaatgaat tctcgtaact aaacagcaca agtaggttta gttaacttta 29520 

atcttta atcaatgtgt aacattaggg aggacttgaa agagccacca 29580 

c gaggccacgc ggagtacgat cgagggtaca gtgaataatg ctagggagag 29640

ctgcctatat ggaagagccc taatgtgtaa aattaatttt agtagtgcta tccccatgtg 29700



<213> ORGANISM: CORONAVIRUS



<400> SEQUENCE: 2

ttctcttctg gaaaaaggta ggcttatcat tagagaaaac aacagagttg tggtttcaag

tgatattctt gttaacaact aaacgaac atg ttt att ttc tta tta ttt ctt
Met Phe Ile Phe Leu Leu Phe Leu



agt ggt agt gac ctt gac cgg tgc acc act
Ser Gly Ser Asp Leu Asp Arg Cys Thr Thr 



Val Gln Ala Pro Asn Tyr Thr Gln His T



Tyr Tyr Pro Asp Glu Ile Phe Arg Ser Asp Thr L
Phe Asp Asn Ala Phe Asn Cys Thr Phe Glu Tyr Ile S



Phe Val Phe Lys A 



Lys Gly Ile Tyr Gln Thr S



gat gta aga caa ata gcg cca gga caa act ggt gtt att gct gat tat 1312
Asp Val Arg Gln Ile Ala Pro Gly Gln Thr Gly Val Ile Ala Asp Tyr
n Cys Tyr Trp Pro Leu Asn Asp Tyr Gly Phe Tyr Thr T



a Phe Asn Gly Leu Thr Gly Thr Gly 



e Thr Asp Ser Val Arg Asp Pro Lys 1 



Thr Pro Gly Thr Asn Ala Ser Ser Glu Val Ala Val Leu Tyr Gln Asp



3 Ile Gly Ala Gly Ile Cys Ala



Pro Thr Asn Phe Ser Ile Ser Ile Thr Thr Glu Val Met Pro Val S
Asp Ile Asn Ala Arg Asp Leu Ile Cys Ala Gln Lys E



Asn Gly Ile Gly Val Thr Gln Asn Val Leu Tyr Glu Asn Gln Lys
890 895 900

atc gcc aac caa ttt aac aag gcg att agt caa att caa gaa tca
Ile Ala Asn Gln Phe Asn Lys Ala Ile Ser Gln Ile Gln Glu Ser



a Leu Asn Thr Leu Val Lys Gln Leu S
a agt gtg cta aat g



Ser Leu Gln Thr Tyr Val Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile



a Ala Ala Thr Lys
Leu Gly Gln Ser Lys Arg Val Asp Phe Cys Gly Lys Gly Tyr His
1020 1025 1030

ctt atg tcc ttc cca caa gca gcc ccg cat ggt gtt gtc ttc cta
Leu Met Ser Phe Pro Gln Ala Ala Pro His Gly Val Val Phe Leu
1035 1040 1045

cat gtc acg tat gtg cca tcc cag gag agg aac ttc acc aca gcg
His Val Thr Tyr Val Pro Ser Gln Glu Arg Asn Phe Thr Thr Ala
1050 1055 1060

cca gca att tgt cat gaa ggc aaa gca tac ttc cct cgt gaa ggt
Pro Ala Ile Cys His Glu Gly Lys Ala Tyr Phe Pro Arg Glu Gly
1065 1070 1075

gtt ttt gtg ttt aat ggc act tct tgg ttt att aca cag agg aac
Val Phe Val Phe Asn Gly Thr Ser Trp Phe Ile Thr Gln Arg Asn
1080 1085 1090



o Gln Ile Ile



Gly Asn Cys Asp Val 



gat cct ctg caa cct gag ctt gac tca ttc aaa gaa gag ctg gac
Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp
1125 1130 1135

aag tac ttc aaa aat cat aca tca cca gat gtt gat ctt ggc gac
Lys Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp
1140 1145 1150

att tca ggc att aac gct tct gtc gtc



t gag gtc gct aaa aat tta aat gaa tca ctc att
n Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile
1170 1175 1180 



1195 



tgg tat gtt tgg ctc ggc ttc att gct gga cta att gcc atc

Trp Tyr Val Trp Leu Gly Phe Ile Ala Gly Leu Ile Ala Ile
1200 1205

atg gtt aca atc ttg ctt tgt tgc atg act agt tgt tgc agt

Met Val Thr Ile Leu Leu Cys Cys Met Thr Ser Cys Cys Ser
1215 1220 



gat gac tct gag cca gtt ctc aag ggt gtc aaa tta cat tac aca 3853
Asp Asp Ser Glu Pro Val Leu Lys Gly Val Lys Leu His Tyr Thr
1245 1250 1255

taaacgaact tatggatttg tttatgagat tttttactct tggatcaatt actgcacagc 3913

cagtaaaaat tgacaatgct tctcctgcaa gt 3945



<210> SEQ ID NO 3 

<211> LENGTH: 1255 

<212> TYPE: PRT 

<213> ORGANISM: CORONAVIRUS
<400> SEQUENCE: 3

Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu

Asp Arg Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Thr Gln

His Thr Ser Ser Met Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg
35 40 45

Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu Pro Phe Tyr Ser
50 55 60

Asn Val Thr Gly Phe His Thr Ile Asn His Thr Phe Gly Asn Pro Val
65 70 75 80

Ile Pro Phe Lys Asp Gly Ile Tyr Phe Ala Ala Thr Glu Lys Ser Asn
85 90 95

Val Val Arg Gly Trp Val Phe Gly Ser Thr Met Asn Asn Lys Ser Gln
100 105 110

Ser Val Ile Ile Ile Asn Asn Ser Thr Asn Val Val Ile Arg Ala Cys
115 120 125

Asn Phe Glu Leu Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Met
130 135 140

Gly Thr Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr
145 150 155 160

Phe Glu Tyr Ile Ser Asp Ala Phe Ser Leu Asp Val Ser Glu Lys Ser
165 170 175 

Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys Asp Gly 
180 185 190 

Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp Val Val Arg Asp
195 200 205

Leu Pro Ser Gly Phe Asn Thr Leu Lys Pro Ile Phe Lys Leu Pro Leu
210 215 220

Gly Ile Asn Ile Thr Asn Phe Arg Ala Ile Leu Thr Ala Phe Ser Pro
225 230 235 240

Ala Gln Asp Ile Trp Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr
245 250 255

Leu Lys Pro Thr Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile
260 265 270

Thr Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys
275 280 285

Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn

Phe Arg Val Val Pro Ser Gly Asp Val Val Arg Phe Pro Asn Ile Thr
305 310 315 320 

Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Lys Phe Pro Ser 
325 330 335 

Val Tyr Ala Trp Glu Arg Lys Lys Ile Ser Asn Cys Val Ala Asp Tyr
340 345 350 

Ser Val Leu Tyr Asn Ser Thr Phe Phe Ser Thr Phe Lys Cys Tyr Gly 
355 360 365 

Val Ser Ala Thr Lys Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala 
370 375 380 

Asp Ser Phe Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala Pro Gly
385 390 395 400

Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe
405 410 415

Met Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile Asp Ala Thr Ser
420 425 430

Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Tyr Leu Arg His Gly Lys Leu

Arg Pro Phe Glu Arg Asp Ile Ser Asn Val Pro Phe Ser Pro Asp Gly

Lys Pro Cys Thr Pro Pro Ala Leu Asn Cys Tyr Trp Pro Leu Asn Asp 

465 470 475 480 

Tyr Gly Phe Tyr Thr Thr Thr Gly Ile Gly Tyr Gln Pro Tyr Arg Val

485 490 495 

Val Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly 
500 505 510 

Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn
515 520 525 

Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser Lys Arg 

530 535 540 

Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Ser Asp Phe Thr Asp
545 550 555 560

Ser Val Arg Asp Pro Lys Thr Ser Glu Ile Leu Asp Ile Ser Pro Cys
565 570 575

Ser Phe Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Ala Ser Ser
580 585 590

Glu Val Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Asp Val Ser Thr

Ala Ile His Ala Asp Gln Leu Thr Pro Ala Trp Arg Ile Tyr Ser Thr
610 615 620

Gly Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu
625 630 635 640

His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile



Cys Ala Ser Tyr His Thr Val Ser Leu Leu Arg Ser Thr Ser Gln Lys
660 665 670

Ser Ile Val Ala Tyr Thr Met Ser Leu Gly Ala Asp Ser Ser Ile Ala
675 680 685

Tyr Ser Asn Asn Thr Ile Ala Ile Pro Thr Asn Phe Ser Ile Ser Ile

690 695 700 

Thr Thr Glu Val Met Pro Val Ser Met Ala Lys Thr Ser Val Asp Cys 
705 710 715 720 

Asn Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu
725 730 735

Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Ser Gly Ile
740 745 750

Ala Ala Glu Gln Asp Arg Asn Thr Arg Glu Val Phe Ala Gln Val Lys
755 760 765

Gln Met Tyr Lys Thr Pro Thr Leu Lys Tyr Phe Gly Gly Phe Asn Phe

Ser Gln Ile Leu Pro Asp Pro Leu Lys Pro Thr Lys Arg S
785 790 795
3 Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Met
805 810 815

l Tyr Gly Glu Cys Leu Gly Asp Ile Asn Ala Arg Asp Leu Ile
820 825 830

i Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr
835 840 845

> Met Ile Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala
) 855 860

i Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe

i Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn Lys Ala
900 905 910

■ Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser Thr Ala Leu Gly
915 920 925

i Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu
I 935 940

i Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn
950 955 960

! Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp
965 970 975

i Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln

i Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala
995 1000 1005 

. Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gin Ala Ala 
:5 1030 

. Thr Tyr Val Pro Ser G 
050 



n Phe Thr Thr Ala 



Ala Tyr Phe Pro Arg G 



Pro Ala He Cys 
Val Phe Val Phe 



065 



Asn He Gin Lys 



r Gin Arg Asn Phe Phe Ser Pro Gil 
1090 

r Phe Val Ser Gly Asn Cys Asp 

u Glu Leu Asp Lys Tyr Phe Lys 
1135 

p Leu Gly Asp He Ser Gly He 
1150 

Asp Arg Leu Asn Glu V£ 
'170 



185 

Tyr He Lys Trp Pro Trp Tyr Val Trp Leu G. 



Gly Lys 
He Thr 
Leu Asp 



y L y s T y r 

Phe Ile
gaccaaaatt atccactgac cttattaaga accagtgtgt caattttaat tttaatggac
tcactggtac tggtgtgtta actccttctt caaagagatt tcaaccattt caacaatttg
gccgtgatgt ctctgatttc actgattccg ttcgagatcc taaaacatct gaaatattag



ctgaagttgc tgttctatat caagatgtta actgcactga tgtttctaca gcaatccat

cagatcaact cacaccagct tggcgcatat attctactgg aaacaatgta ttccagact

aagcaggctg tcttatagga gctgagcatg tcgacacttc ttatgagtgc gacattccta 2040 

ttggagctgg catttgtgct agttaccata cagtttcttt attacgtagt actagccaaa 2100

aatctattgt ggcttatact atgtctttag gtgctgatag ttcaattgct tactctaata 2160 

acaccattgc tatacctact aacttttcaa ttagcattac tacagaagta atgcctgttt 2220 

ctatggctaa aacctccgta gattgtaata tgtacatctg cggagattct actgaatgtg 2280

ctaatttgct tctccaatat ggtagctttt gcacacaact aaatcgtgca ctctcaggta 2340

ttgctgctga acaggatcgc aacacacgtg aagtgttcgc tcaagtcaaa caaatgtaca 2400 

aaaccccaac tttgaaatat tttggtggtt ttaatttttc acaaatatta cctgaccctc 2460 

atgctggctt catgaagcaa tatggcgaat gcctaggtga tattaatgct agagatctca 2580

tttgtgcgca gaagttcaat gggcttacag tgttgccacc tctgctcact gatgatatga 2640

ttgctgccta cactgctgct ctagttagtg gtactgccac tgctggatgg acatttggtg 2700

ctggcgctgc tcttcaaata ccttttgcta tgcaaatggc atataggttc aatggcattg 2760

gagttaccca aaatgttctc tatgagaacc aaaaacaaat cgccaaccaa tttaacaagg 2820

acgttgttaa ccagaatgct caagcattaa acacacttgt taaacaactt agctctaatt 2940 

ttggtgcaat ttcaagtgtg ctaaatgata tcctttcgcg acttgataaa gtcgaggcgg 3000 

aggtacaaat tgacaggcta attacaggca gacttcaaag ccttcaaacc tatgtaacac 3060 



aacaactaat cagggctgct gaaatcaggg cttctgctaa tcttgctgct actaaaatgt 3120

ctgagtgtgt tcttggacaa tcaaaaagag ttgacttttg tggaaagggc taccacctta 3180 

tgtccttccc acaagcagcc ccgcatggtg ttgtcttcct acatgtcacg tatgtgccat 3240 

ctcgtgaagg tgtttttgtg tttaatggca cttcttggtt tattacacag aggaacttct 3360 

tttctccaca aataattact acagacaata catttgtctc aggaaattgt gatgtcgtta 3420 

ttggcatcat taacaacaca gtttatgatc ctctgcaacc tgagcttgac tcattcaaag 3480 

ttaaatggcc ttggtatgtt tggctcggct tcattgctgg actaattgcc atcgtcatgg 3720

ttacaatctt gctttgttgc atgactagtt gttgcagttg cctcaagggt gcatgctctt 3780 

gtggttcttg ctgcaagttt gatgaggatg actctgagcc agttctcaag ggtgtcaaat 3840 

tacattacac ataaacgaac ttatggattt gtttatgaga ttttttactc ttggatcaat 3900
tactgcacag ccagtaaaaa ttgacaatgc ttctcctgca agt 3943




<213> ORGANISM: CORONAVIRUS
<400> SEQUENCE: 5

ctcttctgga aaaaggtagg cttatcatta gagaaaacaa cagagttgtg gtttcaagtg 60 

atattcttgt taacaactaa acgaacatgt ttattttctt attatttctt actctcacta 120

gtggtagtga ccttgaccgg tgcaccactt ttgatgatgt tcaagctcct aattacactc 180

aacatacttc atctatgagg ggggtttact atcctgatga aatttttaga tcagacactc 240

tttatttaac tcaggattta tttcttccat tttattctaa tgttacaggg tttcatacta 300

ttaatcatac gtttggcaac cctgtcatac cttttaagga tggtatttat tttgctgcca 360

cagagaaatc aaatgttgtc cgtggttggg tttttggttc taccatgaac aacaagtcac 420

agtcggtgat tattattaac aattctacta atgttgttat acgagcatgt aactttgaat 480

tgtgtgacaa ccctttcttt gctgtttcta aacccatggg tacacagaca catactatga 540

tattcgataa tgcatttaat tgcactttcg agtacatatc tgatgccttt tcgcttgatg 600

tttcagaaaa gtcaggtaat tttaaacact tacgagagtt tgtgtttaaa aataaagatg 660

ggtttctcta tgtttataag ggctatcaac ctatagatgt agttcgtgat ctaccttctg 720

gttttaacac tttgaaacct atttttaagt tgcctcttgg tattaacatt acaaatttta 780

gagccattct tacagccttt tcacctgctc aagacatttg gggcacgtca gctgcagcct 840

attttgttgg ctatttaaag ccaactacat ttatgctcaa gtatgatgaa aatggtacaa 900

tcacagatgc tgttgattgt tctcaaaatc cacttgctga actcaaatgc tctgttaaga 960

gctttgagat tgacaaagga atttaccaga cctctaattt cagggttgtt ccctcaggag 1020

atgttgtgag attccctaat attacaaact tgtgtccttt tggagaggtt tttaatgcta 1080 

ctaaattccc ttctgtctat gcatgggaga gaaaaaaaat ttctaattgt gttgctgatt 1140 

actctgtgct ctacaactca acattttttt caacctttaa gtgctatggc gtttctgcca 1200

ctaagttgaa tgatctttgc ttctccaatg tctatgcaga ttcttttgta gtcaagggag 1260

atgatgtaag acaaatagcg ccaggacaaa ctggtgttat tgctgattat aattataaat 1320

tgccagatga tttcatgggt tgtgtccttg cttggaatac taggaacatt gatgctactt 1380

caactggtaa ttataattat aaatataggt atcttagaca tggcaagctt aggccctttg 1440

agagagacat atctaatgtg cctttctccc ctgatggcaa accttgcacc ccacctgctc 1500

ttaattgtta ttggccatta aatgattatg gtttttacac cactactggc attggctacc 1560 

tcactggtac tggtgtgtta actccttctt caaagagatt tcaaccattt caacaatttg 1740

gccgtgatgt ctctgatttc actgattccg ttcgagatcc taaaacatct gaaatattag 1800

acatttcacc ttgctctttt gggggtgtaa gtgtaattac acctggaaca aatgcttcat 1860

ctgaagttgc tgttctatat caagatgtta actgcactga tgtttctaca gcaatccatg 1920

cagatcaact cacaccagct tggcgcatat attctactgg aaacaatgta ttccagactc 1980

aagcaggctg tcttatagga gctgagcatg tcgacacttc ttatgagtgc gacattccta 2040

ttggagctg 2049 

<210> SEQ ID NO 6 
<211> LENGTH: 2027

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 6 

catgcagatc aactcacacc agcttggcgc atatattcta ctggaaacaa tgtattccag 60

actcaagcag gctgtcttat aggagctgag catgtcgaca cttcttatga gtgcgacatt 120 

cctattggag ctggcatttg tgctagttac catacagttt ctttattacg tagtactagc 180 

caaaaatcta ttgtggctta tactatgtct ttaggtgctg atagttcaat tgcttactct 240

aataacacca ttgctatacc tactaacttt tcaattagca ttactacaga agtaatgcct 300

gtttctatgg ctaaaacctc cgtagattgt aatatgtaca tctgcggaga ttctactgaa 360

tgtgctaatt tgcttctcca atatggtagc ttttgcacac aactaaatcg tgcactctca 420

ggtattgctg ctgaacagga tcgcaacaca cgtgaagtgt tcgctcaagt caaacaaatg 480

tacaaaaccc caactttgaa atattttggt ggttttaatt tttcacaaat attacctgac 540

cctctaaagc caactaagag gtcttttatt gaggacttgc tctttaataa ggtgacactc 600

gctgatgctg gcttcatgaa gcaatatggc gaatgcctag gtgatattaa tgctagagat 660

ctcatttgtg cgcagaagtt caatgggctt acagtgttgc cacctctgct cactgatgat 720

atgattgctg cctacactgc tgctctagtt agtggtactg ccactgctgg atggacattt 780

ggtgctggcg ctgctcttca aatacctttt gctatgcaaa tggcatatag gttcaatggc 840

aaggcgatta gtcaaattca agaatcactt acaacaacat caactgcatt gggcaagctg 960
caagacgttg ttaaccagaa tgctcaagca ttaaacacac ttgttaaaca acttagctct 
aattttggtg caatttcaag tgtgctaaat gatatccttt cgcgacttga taaagtcgag 



gcggaggtac aaattgacag gttaattaca ggcagacttc aaagccttca aacctatgta
acacaacaac taatcagggc tgctgaaatc agggcttctg ctaatcttgc tgctactaaa 
atgtctgagt gtgttcttgg acaatcaaaa agagttgact tttgtggaaa gggctaccac 
cttatgtcct taccacaagc agccccgcat ggtgttgtct tcctacatgt cacgtatgtg 
ccatcccagg agaggaactt caccacagcg ccagcaattt gtcatgaagg caaagcatac
ttccctcgtg aaggtgtttt tgtgtttaat ggcacttctt ggtttattac acagaggaac
ttcttttctc cacaaataat tactacagac aatacatttg tctcaggaaa ttgtgatgtc

aaagaagagc tggacaagta cttcaaaaat catacatcac cagatgttga tcttggcgac

atttcaggca ttaacgcttc tgtcgtcaac attcaaaaag aaattgaccg cctcaatgag

gtcgctaaaa atttaaatga atcactcatt gaccttcaag aattgggaaa atatgagcaa

tatattaaat ggccttggta tgtttggctc ggcttcattg ctggactaat tgccatcgtc

atggttacaa tcttgctttg ttgcatgact agttgttgca gttgcctcaa gggtgcatgc

tcttgtggtt cttgctgcaa gtttgatgag gatgactctg agccagttct caagggtgtc
aaattacatt acacataaac gaacttatgg atttgtttat gagatttttt actcttggat 1980
caattactgc acagccagta aaaattgaca atgcttctcc tgcaagt 2027




<213> ORGANISM: CORONAVIRUS 
<400> SEQUENCE: 7 

tcttgctttg ttgcatgact agttgttgca gttgcctcaa gggtgcatgc tcttgtggtt 60

cttgctgcaa gtttgatgag gatgactctg agccagttct caagggtgtc aaattacatt 120

acacataaac gaacttatgg atttgtttat gagatttttt actcttggat caattactgc 180

acagccagta aaaattgaca atgcttctcc tgcaagtact gttcatgcta cagcaacgat 240 

accgctacaa gcctcactcc ctttcggatg gcttgttatt ggcgttgcat ttcttgctgt 300 

ttttcagagc gctaccaaaa taattgcgct caataaaaga tggcagctag ccctttataa 360 

gggcttccag ttcatttgca atttactgct gctatttgtt accatctatt cacatctttt 420 

gcttgtcgct gcaggtatgg aggcgcaatt tttgtacctc tatgccttga tatattttct 480 

acaatgcatc aacgcatgta gaattattat gagatgttgg ctttgttgga agtgcaaatc 540 

ttcaacacca aaactcaaag aagactacca aattggtggt tattctgagg ataggcactc 720 

aggtgttaaa gactatgtcg ttgtacatgg ctatttcacc gaagtttact accagcttga 780

gcttgttaaa gacccaccga atgtgcaaat acacacaatc gacggctctt caggagttgc 900 

taatccagca atggatccaa tttatgatga gccgacgacg actactagcg tgcctttgta 960 

agcacaagaa agtgagtacg aacttatgta ctcattcgtt tcggaagaaa caggtacgtt 1020 

aatagttaat agcgtacttc tttttcttgc tttcgtggta ttcttgctag tcacactagc 1080 

catccttact gcgctt 1096

<210> SEQ ID NO 8 

<211> LENGTH: 1135 

<212> TYPE: DNA 

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 8 

attgccatcg tcatggttac aatcttgctt tgttgcatga ctagttgttg cagttgcctc 60 

aagggtgcat gctcttgtgg ttcttgctgc aagtttgatg aggatgactc tgagccagtt 120 

ttactcttgg atcaattact gcacagccag taaaaattga caatgcttct cctgcaagta 240 

ctgttcatgc tacagcaacg ataccgctac aagcctcact ccctttcgga tggcttgtta 300 

ttggcgttgc atttcttgct gtttttcaga gcgctaccaa aataattgcg ctcaataaaa 360 

gatggcagct agccctttat aagggcttcc agttcatttg caatttactg ctgctatttg 420 

ttaccatcta ttcacatctt ttgcttgtcg ctgcaggtat ggaggcgcaa tttttgtacc 480 

tctatgcctt gatatatttt ctacaatgca tcaacgcatg tagaattatt atgagatgtt 540 
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a tccaagaacc cattacttta tgatgocaac tactttgttt 
b gactactgta taccatataa cagtgtcaca gatacaattg 
o atttcaacac caaaactcaa agaagactac caaattggtg 



ccgaagttta ctaccagctt gagtctacac aaattactac agacactggt attgaaaatg
ctacattctt catctttaac aagcttgtta aagacccacc gaatgtgcaa atacacacaa
tcgacggctc ttcaggagtt gctaatccag caatggatcc aatttatgat gagccgacga

tttcggaaga aacaggtacg ttaatagtta atagcgtact tctttttctt gctttcgtgg
tattcttgct agtcacacta gccatcctta ctgcgcttcg attgtgtgcg tactg

<210> SEQ ID NO 9 

<211> LENGTH: 1096 

<212> TYPE: DNA 

<213> ORGANISM: CORONAVIRUS 



<222> 

<223> OTHER INFORMATION: 
<400> SEQUENCE: 9 

tcttgctttg ttgcatgact agttgttgca g 
cttgctgcaa gtttgatgag gatgactctg a 
acacataaac gaactt atg gat ttg ttt



e Phe Thr Leu Gly S 



r He Pro Leu Gin Ala S 



Lys Ile Ile Ala Leu Asn Lys Arg Trp Gln Leu Ala Leu Tyr Lys Gly



Pro Lys Leu Lys Glu Asp Tyr Gln Ile Gly Gly
cac tca ggt gtt aaa gac tat gtc gtt gta cat



Glu Val Tyr Tyr Gln Leu Glu S



Val Lys Asp Pro P 



aga aacaggtacg ttaatagtta atagcgtact tctttttctt 1048
gct agtcacacta gccatcctta ctgcgctt 1096



<400> SEQUENCE: 10 

Met Asp Leu Phe Met Arg Phe Phe Thr Leu Gly Ser Ile Thr Ala Gln
1 5 10 15

Pro Val Lys Ile Asp Asn Ala Ser Pro Ala Ser Thr Val His Ala Thr
20 25 30

Ala Thr Ile Pro Leu Gln Ala Ser Leu Pro Phe Gly Trp Leu Val Ile
35 40 45

Gly Val Ala Phe Leu Ala Val Phe Gln Ser Ala Thr Lys Ile Ile Ala
50 55 60

Leu Asn Lys Arg Trp Gln Leu Ala Leu Tyr Lys Gly Phe Gln Phe Ile
65 70 75 80

Cys Asn Leu Leu Leu Leu Phe Val Thr Ile Tyr Ser His Leu Leu Leu
85 90 95

Val Ala Ala Gly Met Glu Ala Gln Phe Leu Tyr Leu Tyr Ala Leu Ile
100 105 110

Tyr Phe Leu Gln Cys Ile Asn Ala Cys Arg Ile Ile Met Arg Cys Trp
115 120 125

Leu Cys Trp Lys Cys Lys Ser Lys Asn Pro Leu Leu Tyr Asp Ala Asn
130 135 140

Tyr Phe Val Cys Trp His Thr His Asn Tyr Asp Tyr Cys Ile Pro Tyr

Asp Thr Ile Val Val Thr Glu Gly Asp Gly Ile S
165 170 175

Thr Pro Lys Leu Lys Glu Asp Tyr Gln Ile Gly Gly Tyr Ser Glu Asp
180 185 190

Arg His Ser Gly Val Lys Asp Tyr Val Val Val His Gly Tyr Phe Thr
195 200 205

Glu Val Tyr Tyr Gln Leu Glu Ser Thr Gln Ile Thr Thr Asp Thr Gly

Ile Glu Asn Ala Thr Phe Phe Ile Phe Asn Lys Leu Val Lys Asp Pro
225 230 235 240

Pro Asn Val Gln Ile His Thr Ile Asp Gly Ser Ser Gly Val Ala Asn
245 250 255

Pro Ala Met Asp Pro Ile Tyr Asp Glu Pro Thr Thr Thr Thr Ser Val
260 265 270 

Pro Leu 

<210> SEQ ID NO 11 

<211> LENGTH: 1096 

<212> TYPE: DNA 

<213> ORGANISM: CORONAVIRUS 

<220> FEATURE: 

<221> NAME/KEY: CDS

<222> LOCATION: (558)..(1019)

<223> OTHER INFORMATION: 

<400> SEQUENCE: 11 

tcttgctttg ttgcatgact agttgttgca gttgcctcaa gggtgcatgc tcttgtggtt 60

cttgctgcaa gtttgatgag gatgactctg agccagttct caagggtgtc aaattacatt 120

acacataaac gaacttatgg atttgtttat gagatttttt actcttggat caattactgc 180

acagccagta aaaattgaca atgcttctcc tgcaagtact gttcatgcta cagcaacgat 240

accgctacaa gcctcactcc ctttcggatg gcttgttatt ggcgttgcat ttcttgctgt 300

gggcttccag ttcatttgca atttactgct gctatttgtt accatctatt cacatctttt 420

gcttgtcgct gcaggtatgg aggcgcaatt tttgtacctc tatgccttga tatattttct 480

acaatgcatc aacgcatgta gaattattat gagatgttgg ctttgttgga agtgcaaatc 540

caagaaccca ttacttt atg atg cca act act ttg ttt gct ggc aca cac 590



u Leu Gin Thr Leu Val Leu Lys Met L 
80 85 

c ttg tta aag acc cac cga atg tgc a 
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5 Arg Met Cys Lys Tyr Thr Gin 



taatagttaa tagcgtactt ctttttcttg ctttcgtggt attcttgcta gtcacactag 



<210> SEQ ID NO 12 

<211> LENGTH: 154

<212> TYPE: PRT

<213> ORGANISM: CORONAVIRUS

<400> SEQUENCE: 12

Met Met Pro Thr Thr Leu Phe Ala Gly Thr His Ile Thr Met Thr Thr
1 5 10 15

Val Tyr His Ile Thr Val Ser Gln Ile Gln Leu Ser Leu Leu Lys Val
20 25 30

Thr Ala Phe Gln His Gln Asn Ser Lys Lys Thr Thr Lys Leu Val Val

Ile Leu Arg Ile Gly Thr Gln Val Leu Lys Thr Met Ser Leu Tyr Met
50 55 60

Ala Ile Ser Pro Lys Phe Thr Thr Ser Leu Ser Leu His Lys Leu Leu
65 70 75 80

Gln Thr Leu Val Leu Lys Met Leu His Ser Ser Ser Leu Thr Ser Leu
85 90 95

Leu Lys Thr His Arg Met Cys Lys Tyr Thr Gln Ser Thr Ala Leu Gln
100 105 110

Glu Leu Leu Ile Gln Gln Trp Ile Gln Phe Met Met Ser Arg Arg Arg
115 120 125

Leu Leu Ala Cys Leu Cys Lys His Lys Lys Val Ser Thr Asn Leu Cys
130 135 140

Thr His Ser Phe Arg Lys Lys Gln Val Arg
145 150 



<221> NAME/KEY: CDS
<222> LOCATION: (36).. (26 
<223> OTHER INFORMATION: 

<400> SEQUENCE: 13 

tgcctttgta agcacaagaa agt 






ttc gtg gta ttc ttg cta gtc aca cta gcc atc ctt act gcg ctt cga
Phe Val Val Phe Leu Leu Val Thr Leu Ala Ile Leu Thr Ala Leu Arg



t attattctgt 



ttggaacttt aacattgctt atcatggcag acaacggta



<213> ORGANISM: CORONAVIRUS
<400> SEQUENCE: 14

Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu Ile Val Asn Ser
1 5 10 15

Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala
20 25 30

Ile Leu Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn
35 40 45 

Val Ser Leu Val Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn 
50 55 60 

Leu Asn Ser Ser Glu Gly Val Pro Asp Leu Leu Val 
65 70 75 

<210> SEQ ID NO 15 

<211> LENGTH: 332 

<212> TYPE: DNA 

<213> ORGANISM: CORONAVIRUS 



<400> SEQUENCE: 15

tgcctttgta agcacaagaa agtgagtacg aacttatgta ctcattcgtt tcggaagaaa 60

caggtacgtt aatagttaat agcgtacttc tttttcttgc tttcgtggta ttcttgctag 120

tcacactagc catccttact gcgcttcgat tgtgtgcgta ctgctgcaat attgttaacg 180

tgagtttagt aaaaccaacg gtttacgtct actcgcgtgt taaaaatctg aactcttctg 240

aaggagttcc tgatcttctg gtctaaacga actaactatt attattattc tgtttggaac 300

tttaacattg cttatcatgg cagacaacgg ta 332




<220> FEATURE:

<221> NAME/KEY: CDS 

<222> LOCATION: (41). .(703) 

<223> OTHER INFORMATION: 

<400> SEQUENCE: 16 

tattattatt attctgtttg gaactttaac attgcttatc atg gca gac aac ggt 55
Met Ala Asp Asn Gly



c Tyr Tyr Lys Leu Gly 



<400> SEQUENCE: 17 

Met Ala Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln Leu Leu
1 5 10 15

Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp Ile Met
20 25 30

Leu Leu Gln Phe Ala Tyr Ser Asn Arg Asn Arg Phe Leu Tyr Ile Ile
35 40 45 

Lys Leu Val Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys Phe

50 55 60 

Val Leu Ala Ala Val Tyr Arg Ile Asn Trp Val Thr Gly Gly Ile Ala

Ile Ala Met Ala Cys Ile Val Gly Leu Met Trp Leu Ser Tyr Phe Val

Ala Ser Phe Arg Leu Phe Ala Arg Thr Arg Ser Met Trp Ser Phe Asn
100 105 110

Pro Glu Thr Asn Ile Leu Leu Asn Val Pro Leu Arg Gly Thr Ile Val
115 120 125

Thr Arg Pro Leu Met Glu Ser Glu Leu Val Ile Gly Ala Val Ile Ile
130 135 140

Arg Gly His Leu Arg Met Ala Gly His Ser Leu Gly Arg Cys Asp Ile
145 150 155 160

Lys Asp Leu Pro Lys Glu Ile Thr Val Ala Thr Ser Arg Thr Leu Ser
165 170 175

Tyr Tyr Lys Leu Gly Ala Ser Gln Arg Val Gly Thr Asp Ser Gly Phe
180 185 190

Ala Ala Tyr Asn Arg Tyr Arg Ile Gly Asn Tyr Lys Leu Asn Thr Asp
195 200 205

His Ala Gly Ser Asn Asp Asn Ile Ala Leu Leu Val Gln
210 215 220 

<210> SEQ ID NO 18 



<400> SEQUENCE: 18 

cctgatcttc tggtctaaac gaactaacta ttattattat tctgtttgga actttaacat

tgcttatcat ggcagacaac ggtactatta ccgttgagga gcttaaacaa ctcctggaac

aatggaacct agtaataggt ttcctattcc tagcctggat tatgttacta caatttgcct

attctaatcg gaacaggttt ttgtacataa taaagcttgt tttcctctgg ctcttgtggc



cagtaacact tgcttgtttt gtgcttgctg ctgtctacag aattaattgg gtgactggcg
ggattgcgat tgcaatggct tgtattgtag gcttgatgtg gcttagctac ttcgttgctt
ccttcaggct gtttgctcgt acccgctcaa tgtggtcatt caacccagaa acaaacattc

tcattggtgc tgtgatcatt cgtggtcact tgcgaatggc cggacactcc ctagggcgct

gtgacattaa ggacctgcca aaagagatca ctgtggctac atcacgaacg ctttcttatt

acaaattagg agcgtcgcag cgtgtaggca ctgattcagg ttttgctgca tacaaccgct

accgtattgg aaactataaa ttaaatacag accacgccgg tagcaacgac aatattgctt

tgctagtaca gtaagtgaca acagatgttt catcttgttg acttccagg

<210> SEQ ID NO 19 

<211> LENGTH: 1231 

<212> TYPE: DNA

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 19 

taccgtattg gaaactataa attaaataca gaccacgccg gtagcaacga caatattgct

ttgctagtac agtaagtgac aacagatgtt tcatcttgtt gacttccagg ttacaatagc

agagatattg attatcatta tgaggacttt caggattgct atttggaatc ttgacgttat 

aataagttca atagtgagac aattatttaa gcctctaact aagaagaatt attcggagtt

agatgatgaa gaacctatgg agttagatta tccataaaac gaacatgaaa attattctct 

tcctgacatt gattgtattt acatcttgcg agctatatca ctatcaggag tgtgttagag 

gtacgactgt actactaaaa gaaccttgcc catcaggaac atacgagggc aattcaccat 

ttcaccctct tgctgacaat aaatttgcac taacttgcac tagcacacac tttgcttttg

ctgctctagt atttttaata ctttgcttca ccattaagag aaagacagaa tgaatgagct 

cactttaatt gacttctatt tgtgcttttt agcctttctg ctattccttg ttttaataat

gcttattata ttttggtttt cactcgaaat ccaggatcta gaagaacctt gtaccaaagt

ctaaacgaac atgaaacttc tcattgtttt gacttgtatt tctctatgca gttgcatatg 

cactgtagta cagcgctgtg catctaataa acctcatgtg cttgaagatc cttgtaaggt 

acaacactag gggtaatact tatagcactg cttggctttg tgctctagga aaggttttac 

cttttcatag atggcacact atggttcaaa catgcacacc taatgttact atcaactgtc

aagatccagc tggtggtgcg cttatagcta ggtgttggta ccttcatgaa ggtcaccaaa

ctgctgcatt tagagacgta cttgttgttt taaataaacg aacaaattaa aatgtctgat

aatggacccc aatcaaacca acgtagtgcc ccccgcatta catttggtgg acccacagat

tcaactgaca ataaccagaa tggaggacgc a 



gcatacaacc gctaccgtat tggaaactat aaattaaata cagaccacgc cggtagcaac
gacaatattg ctttgctagt acagtaagtg acaacagatg tttcatcttg ttgacttcca 
ggttacaata gcagagatat tgattatcat tatgaggact ttcaggattg ctatttggaa 

aaattattct cttcctgaca ttgattgtat ttacatcttg cgagctatat cactatcagg
agtgtgttag aggtacgact gtactactaa aagaaccttg cccatcagga acatacgagg 
gcaattcacc atttcaccct cttgctgaca ataaatttgc actaacttgc actagcacac 
actttgcttt tgcttgtgct gacggtactc gacataccta tcagctgcgt gcaagatcag 
tttcaccaaa acttttcatc agacaagagg aggttcaaca agagctctac tcgccacttt
ttctcattgt tgctgctcta gtatttttaa tactttgctt caccattaag agaaagacag
aatgaatgag ctcactttaa ttgacttcta tttgtgcttt ttagcctttc tgctattcct 
tgttttaata atgcttatta tattttggtt ttcactcgaa atccaggatc tagaagaacc 
ttgtaccaaa gtctaaacga acatgaaact tctcattgtt ttgacttgta tttctctatg
cagttgcata tgcactgtag tacagcgctg tgcatctaat aaacctcatg tgcttgaaga 900
tccttgtaag gtacaacact aggggtaata cttatagcac tgcttggctt tgtgctctag 960

aaggtcacca aactgctgca tttagagacg tacttgttgt tttaaataaa cgaacgaatt 1140
aaaatgtctg ataatggacc ccaatcaaac caacgtagtg ccccccgcat tacatttggt 1200 
ggacccacag attcaactga caataaccag aatggaggac gc 1242

<210> SEQ ID NO 21 

<211> LENGTH: 1231 

<212> TYPE: DNA

<213> ORGANISM: CORONAVIRUS 

<220> FEATURE: 

<221> NAME/KEY: CDS

<222> LOCATION: (86)..(274)

<223> OTHER INFORMATION: 

<400> SEQUENCE: 21 

taccgtattg gaaactataa attaaataca gaccacgccg gtagcaacga caatattgct 60

ttgctagtac agtaagtgac aacag atg ttt cat ctt gtt gac ttc cag gtt 112 
Met Phe His Leu Val Asp Phe Gln Val



Ile Trp Asn Leu Asp Val Ile Ile Ser Ser Ile Val Arg Gln Leu Phe



gacattgatt gtatttacat cttgcgagct atatcactat caggagtgtg ttagaggtac 

gactgtacta ctaaaagaac cttgcccatc aggaacatac gagggcaatt caccatttca

ccctcttgct gacaataaat ttgcactaac ttgcactagc acacactttg cttttgcttg 

tgctgacggt actcgacata cctatcagct gcgtgcaaga tcagtttcac caaaactttt 

catcagacaa gaggaggttc aacaagagct ctactcgcca ctttttctca ttgttgctgc

tctagtattt ttaatacttt gcttcaccat taagagaaag acagaatgaa tgagctcact

ttaattgact tctatttgtg ctttttagcc tttctgctat tccttgtttt aataatgctt 

acgaacatga aacttctcat tgttttgact tgtatttctc tatgcagttg catatgcact

gtagtacagc gctgtgcatc taataaacct catgtgcttg aagatccttg taaggtacaa

cactaggggt aatacttata gcactgcttg gctttgtgct ctaggaaagg ttttaccttt

tcatagatgg cacactatgg ttcaaacatg cacacctaat gttactatca actgtcaaga 

tccagctggt ggtgcgctta tagctaggtg ttggtacctt catgaaggtc accaaactgc 

tgcatttaga gacgtacttg ttgttttaaa taaacgaaca aattaaaatg tctgataatg
aaaccaacgt agtgcccccc gcattacatt tggtggaccc acagattcaa 1204
ccagaatgga ggacgca 1231



<210> SEQ ID NO 22 

<211> LENGTH: 63 

<212> TYPE: PRT 

<213> ORGANISM: CORONAVIRUS 



Met Phe His Leu Val Asp Phe Gln Val Thr Ile Ala Glu Ile Leu Ile
1 5 10 15

Ile Ile Met Arg Thr Phe Arg Ile Ala Ile Trp Asn Leu Asp Val Ile

Ile Ser Ser Ile Val Arg Gln Leu Phe Lys Pro Leu Thr Lys Lys Asn
35 40 45 

Tyr Ser Glu Leu Asp Asp Glu Glu Pro Met Glu Leu Asp Tyr Pro 



<212> TYPE: DNA

<213> ORGANISM: CORONAVIRUS

<220> FEATURE:

<221> NAME/KEY: CDS

<222> LOCATION: (285)..(650)

<223> OTHER INFORMATION: 

<400> SEQUENCE: 23 

taccgtattg gaaactataa attaaataca gaccacgccg gtagcaacga caatattgct
ttgctagtac agtaagtgac aacagatgtt tcatcttgtt gacttccagg ttacaatagc

aataagttca atagtgagac aattatttaa gcctctaact aagaagaatt attcggagtt 
agatgatgaa gaacctatgg agttagatta tccataaaac gaac atg aaa att att 



r Gly Thr Tyr Glu Gly Asn 



Lys Phe Ala L 
55 



p Gly Thr Arg His Thr Tyr Gln Leu Arg Ala Arg Ser Val Ser Pro



e Leu He Val Ala Ala Leu V 
a gaa tgaatgagct cactttaatt gacttctatt



Ile Lys Arg Lys
tgtgcttttt agcct 



tcattgtttt gacttgtatt tctctatgca gttgcatatg cactgtagta cagcgctgtg 860

catctaataa acctcatgtg cttgaagatc cttgtaaggt acaacactag gggtaatact 920

tatagcactg cttggctttg tgctctagga aaggttttac cttttcatag atggcacact 980

atggttcaaa catgcacacc taatgttact atcaactgtc aagatccagc tggtggtgcg 1040

cttatagcta ggtgttggta ccttcatgaa ggtcaccaaa ctgctgcatt tagagacgta 1100 

cttgttgttt taaataaacg aacaaattaa aatgtctgat aatggacccc aatcaaacca 1160 

acgtagtgcc ccccgcatta catttggtgg acccacagat tcaactgaca ataaccagaa 1220

tggaggacgc a 1231



<210> SEQ r 



Met Lys Ile Ile Leu Phe Leu Thr Leu Ile Val Phe Thr Ser Cys Glu
1 5 10 15

Leu Tyr His Tyr Gln Glu Cys Val Arg Gly Thr Thr Val Leu Leu Lys
20 25 30 

Glu Pro Cys Pro Ser Gly Thr Tyr Glu Gly Asn Ser Pro Phe His Pro 
35 40 45 

Leu Ala Asp Asn Lys Phe Ala Leu Thr Cys Thr Ser Thr His Phe Ala 
50 55 60 

Phe Ala Cys Ala Asp Gly Thr Arg His Thr Tyr Gln Leu Arg Ala Arg
65 70 75 80 

Ser Val Ser Pro Lys Leu Phe Ile Arg Gln Glu Glu Val Gln Gln Glu
85 90 95

Leu Tyr Ser Pro Leu Phe Leu Ile Val Ala Ala Leu Val Phe Leu Ile



Leu Cys Phe Thr Ile Lys Arg Lys Thr Glu
115 120 



<213> ORGANISM: CORONAVIRUS

<222> LOCATION: (650)..(781)
<223> OTHER INFORMATION: 

<400> SEQUENCE: 25

taccgtattg gaaactataa attaaataca gaccacgccg gtagcaacga caatattgct
ttgctagtac agtaagtgac aacagatgtt tcatcttgtt gacttccagg ttacaatagc
agagatattg attatcatta tgaggacttt caggattgct atttggaatc ttgacgttat
aataagttca atagtgagac aattatttaa gcctctaact aagaagaatt attcggagtt
agatgatgaa gaacctatgg agttagatta tccataaaac gaacatgaaa attattctct
tcctgacatt gattgtattt acatcttgcg agctatatca ctatcaggag tgtgttagag 
gtacgactgt actactaaaa gaaccttgcc catcaggaac atacgagggc aattcaccat 

cttgtgctga cggtactcga catacctatc agctgcgtgc aagatcagtt tcaccaaaac 
ttttcatcag acaagaggag gttcaacaag agctctactc gccacttttt ctcattgttg
ctgctctagt atttttaata ctttgcttca ccattaagag aaagacaga atg aat gag 



aacgaaca tgaaacttct 

cattgttttg acttgtattt ctctatgcag ttgcatatgc actgtagtac agcgctgtgc

atagcactgc ttggctttgt gctctaggaa aggttttacc ttttcataga tggcacacta

tggttcaaac atgcacacct aatgttacta tcaactgtca agatccagct ggtggtgcgc

ttatagctag gtgttggtac cttcatgaag gtcaccaaac tgctgcattt agagacgtac

ttgttgtttt aaataaacga acaaattaaa atgtctgata atggacccca atcaaaccaa

cgtagtgccc cccgcattac atttggtgga cccacagatt caactgacaa taaccagaat



<213> ORGANISM: CORONAVIRUS



Met Asn Glu Leu Thr Leu Ile Asp Phe Tyr Leu Cys Phe Leu Ala Phe
1 5 10 15

Leu Leu Phe Leu Val Leu Ile Met Leu Ile Ile Phe Trp Phe Ser Leu
20 25 30 



<210> SEQ ID NO 27 

<211> LENGTH: 1231 

<212> TYPE: DNA 

<213> ORGANISM: CORONAVIRUS 

<220> FEATURE:




<400> SEQUENCE: 27

taccgtattg gaaactataa attaaataca gaccacgccg gtagcaacga caatattgct 60
ttgctagtac agtaagtgac aacagatgtt tcatcttgtt gacttccagg ttacaatagc 120
agagatattg attatcatta tgaggacttt caggattgct atttggaatc ttgacgttat
aataagttca atagtgagac aattatttaa gcctctaact aagaagaatt attcggagtt
agatgatgaa gaacctatgg agttagatta tccataaaac gaacatgaaa attattctct
tcctgacatt gattgtattt acatcttgcg agctatatca ctatcaggag tgtgttagag
gtacgactgt actactaaaa gaaccttgcc catcaggaac atacgagggc aattcaccat
ttcaccctct tgctgacaat aaatttgcac taacttgcac tagcacacac tttgcttttg
cttgtgctga cggtactcga catacctatc agctgcgtgc aagatcagtt tcaccaaaac
ttttcatcag acaagaggag gttcaacaag agctctactc gccacttttt ctcattgttg
ctgctctagt atttttaata ctttgcttca ccattaagag aaagacagaa tgaatgagct
cactttaatt gacttctatt tgtgcttttt agcctttctg ctattccttg ttttaataat
gcttattata ttttggtttt cactcgaaat ccaggatcta gaagaacctt gtaccaaagt



Val L 

ctgcttggct ttgtgctcta ggaaaggttt taccttttca tagatggcac actatggttc 987

aaacatgcac acctaatgtt actatcaact gtcaagatcc agctggtggt gcgcttatag 1047

ctaggtgttg gtaccttcat gaaggtcacc aaactgctgc atttagagac gtacttgttg 1107

jaacaaat taaaatgtct gataatggac cccaatcaaa ccaacgtagt 1167

icatttgg tggacccaca gattcaactg acaataacca gaatggagga 1227 



<213> ORGANISM: CORONAVIRUS 
<400> SEQUENCE: 28 

Met Lys Leu Leu Ile Val Leu Thr Cys Ile Ser Leu Cys S



Cys Thr Val Val Gln Arg Cys Ala Ser Asn Lys Pro H
20 25 

Asp Pro Cys Lys Val Gln His



<400> SEQUENCE: 29 

taccgtattg gaaactataa attaaataca gaccacgccg gtagcaacga caatattgct 60 

ttgctagtac agtaagtgac aacagatgtt tcatcttgtt gacttccagg ttacaatagc 120 

agagatattg attatcatta tgaggacttt caggattgct atttggaatc ttgacgttat 180 

agatgatgaa gaacctatgg agttagatta tccataaaac gaacatgaaa attattctct 300 

tcctgacatt gattgtattt acatcttgcg agctatatca ctatcaggag tgtgttagag 360 

gtacgactgt actactaaaa gaaccttgcc catcaggaac atacgagggc aattcaccat 420

ttcaccctct tgctgacaat aaatttgcac taacttgcac tagcacacac tttgcttttg 480

ttttcatcag acaagaggag gttcaacaag agctctactc gccacttttt ctcattgttg 600 

ctgctctagt atttttaata ctttgcttca ccattaagag aaagacagaa tgaatgagct 660

cactttaatt gacttctatt tgtgcttttt agcctttctg ctattccttg ttttaataat 720

gcttattata ttttggtttt cactcgaaat ccaggatcta gaagaacctt gtaccaaagt 780 

ctaaacgaac atgaaacttc tcattgtttt gacttgtatt tctctatgca gttgcatatg 840

cactgtagta cagcgctgtg catctaataa acctc atg tgc ttg aag atc ctt 893



taaaatgtct gataatggac cccaatcaaa ccaacgtagt gcc
tggacccaca gattcaactg acaataacca gaatggagga cgc 

<210> SEQ ID NO 30 
<211> LENGTH: 84 
<212> TYPE: PRT



<400> SEQUENCE: 30

Met Cys Leu Lys Ile Leu Val Arg Tyr Asn Thr Arg Gly Asn Thr Tyr
1 5 10 15

Ser Thr Ala Trp Leu Cys Ala Leu Gly Lys Val Leu Pro Phe His Arg 
20 25 30 

Trp His Thr Met Val Gln Thr Cys Thr Pro Asn Val Thr Ile Asn Cys



n Asp Pro Ala Gly Gly Ala Leu Ile Ala Arg Cys Trp Tyr Leu H
50 55 60
Glu Gly His Gln Thr Ala Ala Phe Arg Asp Val Leu Val V
65 70 75 

Lys Arg Thr Asn 



<400> SEQUENCE: 31 

atggagagcc ttgttcttgg tgtcaacgag aaaacacacg tccaactcag tttgcctgtc
cttcaggtta gagacgtgct agtgcgtggc ttcggggact ctgtggaaga ggccctatcg 
gaggcacgtg aacacctcaa aaatggcact tgtggtctag tagagctgga aaaaggcgta 
ctgccccagc ttgaacagcc ctatgtgttc attaaacgtt ctgatgcctt aagcaccaat 
cacggccaca aggtcgttga gctggttgca gaaatggacg gcattcagta cggtcgtagc 
ggtataacac tgggagtact cgtgccacat gtgggcgaaa ccccaattgc ataccgcaat
gttcttcttc gtaagaacgg taataaggga gccggtggtc atagctatgg catcgatcta 
aagtcttatg acttaggtga cgagcttggc actgatccca ttgaagatta tgaacaaaac
tggaacacta agcatggcag tggtgcactc cgtgaactca ctcgtgagct caatggaggt
gcagtcactc gctatgtcga caacaatttc tgtggcccag atgggtaccc tcttgattgc

tacatcgagt cgaagagagg tgtctactgc tgccgtgacc atgagcatga aattgcctgg
ttcactgagc gctctgataa gagctacgag caccagacac ccttcgaaat taagagtgcc 
aagaaatttg acactttcaa aggggaatgc ccaaagtttg tgtttcctct taactcaaaa 
gtcaaagtca ttcaaccacg tgttgaaaag aaaaagactg agggtttcat ggggcgtata
cgctctgtgt accctgttgc atctccacag gagtgtaaca atatgcactt gtctaccttg 
atgaaatgta atcattgcga tgaagtttca tggcagacgt gcgactttct gaaagccact 
tgtgaacatt gtggcactga aaatttagtt attgaaggac ctactacatg tgggtaccta 
cctactaatg ctgtagtgaa aatgccatgt cctgcctgtc aagacccaga gattggacct 
gagcatagtg ttgcagatta tcacaaccac tcaaacattg aaactcgact ccgcaaggga 
ggtaggacta gatgttttgg aggctgtgtg tttgcctatg ttggctgcta taataagcgt 
gcctactggg ttcctcgtgc tagtgctgat attggctcag gccatactgg cattactggt
gacaatgtgg agaccttgaa tgaggatctc cttgagatac tgagtcgtga acgtgttaac
attaacattg ttggcgattt tcatttgaat gaagaggttg ccatcatttt ggcatctttc
tctgcttcta caagtgcctt tattgacact ataaagagtc ttgattacaa gtctttcaaa
accattgttg agtcctgcgg taactataaa gttaccaagg gaaagcccgt aaaaggtgct
tggaacattg gacaacagag atcagtttta acaccactgt gtggttttcc ctcacaggct 
gctggtgtta tcagatcaat ttttgcgcgc acacttgatg cagcaaacca ctcaattcct
gatttgcaaa gagcagctgt caccatactt gatggtattt ctgaacagtc attacgtctt 
gtcgacgcca tggtttatac ttcagacctg ctcaccaaca gtgtcattat tatggcatat
gtaactggtg gtcttgtaca acagacttct cagtggttgt ctaatctttt gggcactact
gttgaaaaac tcaggcctat ctttgaatgg attgaggcga aacttagtgc aggagttgaa

tttctcaagg atgcttggga gattctcaaa tttctcatta caggtgtttt tgacatcgtc 1980

aagggtcaaa tacaggttgc ttcagataac atcaaggatt gtgtaaaatg cttcattgat 2040 

gttgttaaca aggcactcga aatgtgcatt gatcaagtca ctatcgctgg cgcaaagttg 2100 

cgatcactca acttaggtga agtcttcatc gctcaaagca agggacttta ccgtcagtgt 2160 

atacgtggca aggagcagct gcaactactc atgcctctta aggcaccaaa agaagtaacc 2220

tttcttgaag gtgattcaca tgacacagta cttacctctg aggaggttgt tctcaagaac 2280

ggtgaactcg aagcactcga gacgcccgtt gatagcttca caaatggagc tatcgttggc 2340 

acaccagtct gtgtaaatgg cctcatgctc ttagagatta aggacaaaga acaatactgc 2400 

gcattgtctc ctggtttact ggctacaaac aatgtctttc gcttaaaagg gggtgcacca 2460 

attaaaggtg taacctttgg agaagatact gtttgggaag ttcaaggtta caagaatgtg 2520

agaatcacat ttgagcttga tgaacgtgtt gacaaagtgc ttaatgaaaa gtgctctgtc 2580

tacactgttg aatccggtac cgaagttact gagtttgcat gtgttgtagc agaggctgtt 2640 

gtgaagactt tacaaccagt ttctgatctc cttaccaaca tgggtattga tcttgatgag 2700

tggagtgtag ctacattcta cttatttgat gatgctggtg aagaaaactt ttcatcacgt 2760

atgtattgtt ccttttaccc tccagatgag gaagaagagg acgatgcaga gtgtgaggaa 2820

gaagaaattg atgaaacctg tgaacatgag tacggtacag aggatgatta tcaaggtctc 2880

cctctggaat ttggtgcctc agctgaaaca gttcgagttg aggaagaaga agaggaagac 2940

tggctggatg atactactga gcaatcagag attgagccag aaccagaacc tacacctgaa 3000

gaaccagtta atcagtttac tggttattta aaacttactg acaatgttgc cattaaatgt 3060

gttgacatcg ttaaggaggc acaaagtgct aatcctatgg tgattgtaaa tgctgctaac 3120

atacacctga aacatggtgg tggtgtagca ggtgcactca acaaggcaac caatggtgcc 3180

atgcaaaagg agagtgatga ttacattaag ctaaatggcc ctcttacagt aggagggtct 3240 

tgtttgcttt ctggacataa tcttgctaag aagtgtctgc atgttgttgg acctaaccta 3300

aatgcaggtg aggacatcca gcttcttaag gcagcatatg aaaatttcaa ttcacaggac 3360

atcttacttg caccattgtt gtcagcaggc atatttggtg ctaaaccact tcagtcttta 3420

caagtgtgcg tgcagacggt tcgtacacag gtttatattg cagtcaatga caaagctctt 3480 

tatgagcagg ttgtcatgga ttatcttgat aacctgaagc ctagagtgga agcacctaaa 3540 

caagaggagc caccaaacac agaagattcc aaaactgagg agaaatctgt cgtacagaag 3600 

gaaactaagt ttcttaccaa taagttactc ttgtttgctg atatcaatgg taagctttac 3720 

catgattctc agaacatgct tagaggtgaa gatatgtctt tccttgagaa ggatgcacct 3780

tacatggtag gtgatgttat cactagtggt gatatcactt gtgttgtaat accctccaaa 3840 

aaggctggtg gcactactga gatgctctca agagctttga agaaagtgcc agttgatgag 3900

tatataacca cgtaccctgg acaaggatgt gctggttata cacttgagga agctaagact 3960

gctcttaaga aatgcaaatc tgcattttat gtactacctt cagaagcacc taatgctaag 4020

gaagagattc taggaactgt atcatggaat ttgagagaaa tgcttgctca tgctgaagag 4080 

acaagaaaat taatgcctat atgcatggat gttagagcca taatggcaac catccaacgt 4140

aagtataaag gaattaaaat tcaagagggc atcgttgact atggtgtccg attcttcttt 4200

cttgtcacaa tgccaattgg ttatgtgaca catggtttta atottgaaga ggctgcgcgc 4320 

tgtatgcgtt otottaaagc tcctgccgta gtgtcagtat catcaccaga tgctgttact 4380 

acatataatg gatacctcac ttcgtcatca aagacatctg aggagcactt tgtagaaaca 4440 

gtttctttgg ctggctctta cagagattgg tcctattcag gacagogtac agagttaggt 4500 

gttgaatttc ttaagcgtgg tgacaaaatt gtgtaccaca ctctggagag ooocgtcgag 4560 

tttcatcttg acggtgaggt tctttcactt gaoaaactaa agagtctctt atccctgcgg 4620 

gaggttaaga ctataaaagt gttcacaact gtggacaaca ctaatctcca cacacagctt 4680 

gtggatatgt ctatgacata tggacagcag tttggtccaa catacttgga tggtgctgat 4740 

gttacaaaaa ttaaacctca tgtaaatcat gagggtaaga ctttctttgt actacctagt 4800 

gatgaoacac taogtagtga agctttcgag tactaccata ctcttgatga gagttttctt 4860 

ggtaggtaca tgtctgcttt aaaccacaca aagaaatgga aatttcctca agttggtggt 4920 

ttaacttcaa ttaaatgggc tgataacaat tgttatttgt ctagtgtttt attagcactt 4980 

caacagcttg aagtcaaatt oaatgcacca gcacttcaag aggcttatta tagagcccgt 5040 

gctggtgatg ctgotaactt ttgtgoaoto atactcgctt aoagtaataa aaotgttggc 5100 

gagcttggtg atgtcagaga aactatgaoc catcttctac agcatgctaa tttggaatot 5160 

ggtgtagaag ctgtgatgta tatgggtact ctatcttatg ataatottaa gacaggtgtt 5280 

tccattccat gtgtgtgtgg tcgtgatgct acacaatatc tagtacaaca agagtottct 5340 

tttgttatga tgtotgoaco acctgctgag tataaattao agcaaggtac attcttatgt 5400 

gcgaatgagt acactggtaa ctatcagtgt ggtoattaca otcatataac tgctaaggag 5460 

accctctatc gtattgacgg agctcacctt acaaagatgt cagagtacaa aggaccagtg 5520 

actgatgttt tctacaagga aacatcttac actacaacca tcaagcctgt gtcgtataaa 5580 

ctcgatggag ttacttaoao agagattgaa ccaaaattgg atgggtatta taaaaaggat 5640 

aatgcttact atacagagca gcotatagac cttgtaccaa ctcaacoatt accaaatgcg 5700 

agttttgata atttcaaact cacatgttct aacacaaaat ttgctgatga tttaaatcaa 5760 

atgacaggct tcacaaagcc agcttcacga gagctatctg tcacattctt cccagacttg 5820 

aatggcgatg tagtggctat tgactataga cactattcag cgagtttcaa gaaaggtgct 5880 

aaattactgc ataagccaat tgtttggcao attaaccagg ctacaaccaa gacaacgttc 5940 

aaaccaaaca ottggtgttt acgttgtctt tggagtacaa agccagtaga tacttcaaat 6000 

tcatttgaag ttctggcagt agaagacaca caaggaatgg acaatcttgc ttgtgaaagt 6060 

caacaaccca cctctgaaga agtagtggaa aatcctacca tacagaagga agtcatagag 6120 

tgtgacgtga aaactaccga agttgtaggc aatgtcatac ttaaaccatc agatgaaggt 6180

gttaaagtaa cacaagagtt aggtcatgag gatcttatgg ctgottatgt ggaaaacaca 6240 

agcattacca ttaagaaacc taatgagctt tcactagcot taggtttaaa aacaattgcc 6300 

actcatggta ttgctgcaat taatagtgtt ccttggagta aaattttggc ttatgtcaaa 6360 

ccattcttag gaoaagcagc aattacaaca tcaaattgcg ctaagagatt agcacaacgt 6420 

gtgtttaaca attatatgco ttatgtgttt acattattgt tccaattgtg tacttttact 6480 

aagagtgttg ctaaattatg tttggatgcc ggcattaatt atgtgaagtc accoaaattt 6600 

tctaaattgt tcacaatcgc tatgtggcta ttgttgttaa gtatttgctt aggttctcta 6660 

atctgtgtaa ctgctgcttt tggtgtactc ttatotaatt ttggtgctcc ttcttattgt 6720 

aatggcgtta gagaattgta tcttaattcg tctaacgtta ctactatgga tttctgtgaa 6780 

cttgaaacca ttcaggtgao gatttcatcg tacaagctag acttgacaat tttaggtctg 6900 

gccgctgagt gggttttggc atatatgttg ttcacaaaat tcttttattt attaggtctt 6960 

tcagctataa tgcaggtgtt ctttggctat tttgotagtc atttcatcag caattcttgg 7020 

ctcatgtggt ttatcattag tattgtacaa atggcacccg tttctgcaat ggttaggatg 7080 

tacatcttct ttgcttcttt ctactacata tggaagagct atgttcatat catggatggt 7140 

tgcacctctt cgacttgcat gatgtgctat aagcgcaatc gtgccacacg cgttgagtgt 7200 

acaactattg ttaatggoat gaagagatct ttctatgtct atgcaaatgg aggccgtggc 7260 

ttctgcaaga ctcacaattg gaattgtctc aattgtgaoa cattttgcac tggtagtaca 7320 

ttcattagtg atgaagttgo tcgtgatttg toactcoagt ttaaaagaoc aatcaaccct 7380 

actgaccagt catogtatat tgttgatagt gttgctgtga aaaatggcgo gcttcaocto 7440 

gatggcaagt ccaaatgcga cgagtctgct tctaagtctg cttctgtgta ctacagtcag 7620 

ctgatgtgcc aacctattct gttgottgao caagctcttg tatcagacgt tggagatagt 7680 

actgaagttt ccgttaagat gtttgatgct tatgtcgaca ccttttcagc aacttttagt 7740

gttcctatgg aaaaacttaa ggcacttgtt gctaoagctc acagcgagtt agcaaagggt 7800 

gtagctttag atggtgtcct ttctacattc gtgtcagctg cccgacaagg tgttgttgat 7860 

accgatgttg acacaaagga tgttattgaa tgtctcaaac tttcacatca ctctgactta 7920 

gaagtgacag gtgacagttg taaoaattto atgctcacct ataataaggt tgaaaacatg 7980 

acgcccagag atcttggcgc atgtattgac tgtaatgcaa ggcatatcaa tgcccaagta 8040 

gcaaaaagtc acaatgtttc actcatctgg aatgtaaaag actacatgtc tttatctgaa 8100 

oagctgogta aacaaattcg tagtgctgcc aagaagaaca acataccttt tagactaact 8160 

tgtgctacaa ctagacaggt tgtcaatgtc ataactacta aaatctcaot caagggtggt 8220 

aagattgtta gtacttgttt taaacttatg cttaaggcca cattattgtg cgttcttgct 8280 

gcattggttt gttatatcgt tatgccagta catacattgt caatccatga tggttacaca 8340 

aatgaaatca ttggttacaa agccattcag gatggtgtoa otcgtgacat catttctact 8400 

gatgattgtt ttgcaaataa acatgctggt tttgacgcat ggtttagcoa gcgtggtggt 8460 

tcatacaaaa atgacaaaag ctgccctgta gtagctgcta tcattacaag agagattggt 8520

ttcatagtgc ctggcttacc gggtactgtg ctgagagcaa tcaatggtga cttcttgcat 8580 

tttctacctc gtgtttttag tgctgttggc aacatttgct acacaccttc caaactcatt 8640 

gagtatagtg attttgctao ctctgottgc gttcttgctg ctgagtgtac aatttttaag 8700 

gatgotatgg gcaaacctgt gcoatattgt tatgacacta atttgctaga gggttotatt 8760 

tttcctaaca ctfcacctgga gggttctgtt agagtagtaa caacttttga tgctgagtac 8880 

tgtagacatg gtacatgcga aaggtcagaa gtaggtattt gcctatctao cagtggtaga 8940 

tgggttotta ataatgagoa ttacagagct ctatcaggag ttttctgtgg tgttgatgcg 9000 

atgaatctca tagctaacat ctttactcct cttgtgcaac ctgtgggtgc tttagatgtg 9060 

totgcttcag tagtggctgg tggtattatt gccatattgg tgacttgtgc tgcctaotac 9120 

tttatgaaat tcagacgtgt ttttggtgag tacaaocatg ttgttgctgc taatgcactt 9180 

ttgtttttga tgtctttcac tatactctgt ctggtaccag cttacagctt tctgccggga 9240 

gtctactcag tcttttactt gtacttgaca ttctatttca ccaatgatgt ttcattcttg 9300 

gctoaccttc aatggtttgc catgttttct cctattgtgc ctttttggat aacagoaatc 9360 

tatgtattct gtatttctct gaagcactgc oattggttct ttaacaacta tcttaggaaa 9420 

agagtcatgt ftaatggagt tacatttagt accttcgagg aggctgcttt gtgtaccttt 9480 

ttgctcaaca aggaaatgta cctaaaattg cgtagcgaga caotgttgcc acttacacag 9540 

tataacaggt atcttgctct atataacaag tacaagtatt tcagtggagc cttagatact 9600 

aocagctato gtgaagcago ttgctgccac ttagoaaagg ototaaatga ctttagoaac 9660 

toaggtgotg atgttotota ccaaccacca cagacatcaa tcacttctgc tgttctgcag 9720 

agtggtttta ggaaaatggc attcccgtca ggcaaagttg aagggtgcat ggtacaagta 9780 

acctgtggaa ctacaactct taatggattg tggttggatg acacagtata ctgtccaaga 9840 

oatgtoattt gcacagcaga agaoatgott aatcctaact atgaagatct gctcattcgc 9900 

aaatccaacc atagctttct tgttcaggct ggcaatgttc aacttcgtgt tattggccat 9960 

tctatgcaaa attgtctgct taggcttaaa gttgatactt ctaaccctaa gacacccaag 10020 

tataaatttg tccgtatcca acctggtcaa acattttcag ttctagcatg ctacaatggt 10080 

tcaccatctg gtgtttatca gtgtgccatg agacctaatc ataccattaa aggttotttc 10140 

cttaatggat oatgtggtag tgttggtttt aacattgatt atgattgcgt gtctttctgc 10200 

tatatgoato atatggagot tcoaacagga gtacacgctg gtaotgaott agaaggtaaa 10260 

ttctatggtc catttgttga cagacaaact gcacaggctg caggtacaga cacaaccata 10320 

acattaaatg ttttggcatg gctgtatgct gctgttatca atggtgatag gtggtttctt 10380 

aatagattca ccactacttt gaatgacttt aaccttgtgg caatgaagta caactatgaa 10440 

cctttgacac aagatcatgt tgacatattg ggacctcttt ctgctcaaac aggaattgcc 10500 

gtcttagata tgtgtgctgc tttgaaagag otgotgcaga atggtatgaa tggtcgtact 10560 

atccttggta gcactatttt agaagatgag tttacaccat ttgatgttgt tagacaatgc 10620 

tctggtgtta ccttocaagg taagttcaag aaaattgtta agggcactca tcattggatg 10680 

cttttaactt tcttgaoatc aotattgatt cttgttcaaa gtacacagtg gtcactgttt 10740 

ttctttgttt acgagaatgc tttcttgooa tttactcttg gtattatggc aattgctgca 10800 

tgtgctatgc tgcttgttaa gcataagcac gcattcttgt gcttgtttct gttaccttct 10860 

cttgcaacag ttgcttactt taatatggtc tacatgoctg ctagctgggt gatgcgtatc 10920 

atgaoatggo ttgaattggc tgaoactagc ttgtctggtt ataggottaa ggattgtgtt 10980 

atgtatgctt cagctttagt tttgcttatt atcatgaoag ctcgcactgt ttatgatgat 11040 
ggtaatgctt tagatcaagc tatttccatg tgggccttag ttatttctgt aacctctaac 11160

tattctggtg tcgttacgac tatcatgttt ttagctagag ctatagtgtt tgtgtgtgtt 11220 

gagtattacc cattgttatt tattactggc aacaccttac agtgtatcat gcttgtttat 11280 

tgtttcttag gotattgttg ctgctgctac tttggccttt tctgtttact caaccgttac 11340 

ttcaggctta otcttggtgt ttatgactac ttggtctcta oaoaagaatt taggtatatg 11400 

aactcccagg ggottttgcc tcctaagagt agtattgatg ctttoaagct taacattaag 11460 

ttgttgggta ttggaggtaa accatgtatc aaggttgcta ctgtacagtc taaaatgtct 11520 

gacgtaaagt gcacatctgt ggtactgctc tcggttcttc aacaacttag agtagagtca 11580 

tcttctaaat tgtgggcaca atgtgtacaa ctccacaatg atattottct tgcaaaagac 11640 

acaactgaag ctttcgagaa gatggtttct cttttgtctg ttttgctatc catgcagggt 11700 

gctgtagaca ttaataggtt gtgcgaggaa atgctcgata accgtgctac tcttcaggct 11760 

attgcttcag aatttagttc tttaccatca tatgccgctt atgccactgc ccaggaggcc 11820 

tatgagcagg ctgtagctaa tggtgattct gaagtcgttc toaaaaagtt aaagaaatot 11880 

ttgaatgtgg ctaaatotga gtttgaoogt gatgotgooa tgcaaogcaa gttggaaaag 11940 

atggoagato aggctatgac ccaaatgtac aaacaggoaa gatctgagga caagagggoa 12000 

aaagtaacta gtgctatgca aacaatgctc ttcactatgc ttaggaagct tgataatgat 12060 

ttgactacag cagccaaact catggttgtt gtccctgatt atggtaccta caagaacact 12180 

tgtgatggta acacctttac atatgcatct gcactctggg aaatccagca agttgttgat 12240

goggatagca agattgttca acttagtgaa attaacatgg acaattcacc aaatttggot 12300 

tggcctctta ttgttacagc totaagagco aactcagctg ttaaactaca gaataatgaa 12360 

ctgagtccag tagcactacg acagatgtcc tgtgcggctg gtaccacaca aacagcttgt 12420 

actgatgaca atgcacttgc ctactataac aattcgaagg gaggtaggtt tgtgctggca 12480 

acaatttaca cagaactgga accaccttgt aggtttgtta cagacacacc aaaagggcct 12600 

aaagtgaaat acttgtactt catcaaaggc ttaaacaacc taaatagagg tatggtgctg 12660 

ggcagtttag ctgctacagt acgtcttcag gctggaaatg ctacagaagt acctgccaat 12720 

tcaactgtgc tttccttctg tgcttttgca gtagaccctg ctaaagcata taaggattac 12780

ctagcaagtg gaggacaaco aatcaccaac tgtgtgaaga tgttgtgtac acacactggt 12840 

acaggacagg caattactgt aacaccagaa gctaacatgg accaagagtc ctttggtggt 12900 

gcttcatgtt gtctgtattg tagatgccac attgaccatc caaatcctaa aggattctgt 12960 

gacttgaaag gtaagtacgt ccaaatacct accacttgtg ctaatgaccc agtgggtttt 13020

acacttagaa acacagtotg taccgtctgc ggaatgtgga aaggttatgg ctgtagttgt 13080 

gaccaactcc gcgaaccctt gatgcagtct gcggatgcat caacgttttt aaacgggttt 13140

gcggtgtaag tgcagcccgt cttacaocgt gcggoacagg cactagtact gatgtcgtct 13200 

acagggcttt tgatatttac aacgaaaaag ttgctggttt tgcaaagttc otaaaaacta 13260 

attgctgtog cttccaggag aaggatgagg aaggcaattt attagaotct tactttgtag 13320 
ttaagaggca tactatgtct aactaccaac atgaagagac tatttataac ttggttaaag

attgtccagc ggttgctgtc catgactttt tcaagtttag agtagatggt gacatggtac 

cacatatatc aogtcagcgt ctaactaaat acacaatggc tgatttagtc tatgctctac 

gtcattttga tgagggtaat tgtgatacat taaaagaaat actcgtoaca tacaattgct 

tacgogtata tgctaactta ggtgagcgtg tacgccaatc attattaaag actgtacaat 

tctgogatgc tatgcgtgat gcaggcattg taggcgtact gacattagat aatcaggato 

ttaatgggaa ctggtacgat ttcggtgatt tcgtacaagt agcaccaggc tgcggagttc 

ctattgtgga ttcatattac tcattgctga tgcccatcct cactttgact agggcattgg 13860 

ctgctgagtc ccatatggat gctgatctcg caaaaccact tattaagtgg gatttgctga 13920 

aatatgattt taoggaagag agaotttgto tcttogaccg ttattttaaa tattgggacc 13980 

agacataoca tcocaattgt attaactgtt tggatgatag gtgtatcctt cattgtgcaa 14040 

aotttaatgt gttattttct actgtgtttc cacctacaag ttttggacca ctagtaagaa 

aaatatttgt agatggtgtt ccttttgttg tttcaactgg ataccatttt cgtgagttag 

gagtcgtaca taatcaggat gtaaacttao atagotcgcg tctoagttto aaggaaottt 14220 

tagtgtatgc tgctgatoca gotatgcatg oagcttctgg oaatttattg ctagataaac 14280 

gcactacatg cttttoagta gctgcactaa caaacaatgt tgcttttcaa actgtoaaac 

ccggtaattt taataaagac ttttatgact ttgotgtgtc taaaggtttc tttaaggaag 

gaagttctgt tgaactaaaa cacttcttct ttgctcagga tggcaacgct gctatcagtg 

attatgacta ttatcgttat aatctgccaa caatgtgtga tatcagacaa ctcctattcg 14520

tagttgaagt tgttgataaa tactttgatt gttacgatgg tggctgtatt aatgccaacc 14580 

aagtaatcgt taacaatctg gataaatcag ctggtttccc atttaataaa tggggtaagg 

ctagacttta ttatgactca atgagttatg aggatcaaga tgcacttttc gcgtatacta 

agcgtaatgt catccctact ataaotoaaa tgaatcttaa gtatgccatt agtgcaaaga 

atagagctcg caocgtagct ggtgtotota tctgtagtac tatgacaaat agacagtttc 

atcagaaatt attgaagtca atagccgcca ctagaggagc tactgtggta attggaacaa 14880 

gcaagtttta cggtggctgg cataatatgt taaaaactgt ttacagtgat gtagaaactc 14940 

oacaccttat gggttgggat tatccaaaat gtgacagagc catgcctaac atgcttagga 15000 

taatggccto tcttgttctt gctcgcaaac ataacacttg ctgtaactta tcacacogtt 

tctacaggtt agctaacgag tgtgcgcaag tattaagtga gatggtcatg tgtggoggct 

cactatatgt taaaccaggt ggaacatcat ccggtgatgc tacaactgct tatgotaata 15180 

gtgtctttaa catttgtcaa gctgttacag ccaatgtaaa tgcacttctt tcaactgatg 15240 

gtaataagat agotgacaag tatgtccgca atctacaaoa caggctotat gagtgtctct 15300 

atagaaatag ggatgttgat catgaattcg tggatgagtt ttacgcttac ctgcgtaaac 15360 

atttctccat gatgattctt tctgatgatg ccgttgtgtg ctataacagt aactatgc 

ctcaaggttt agtagctagc attaagaact ttaaggcagt tctttattat caaaatae 

tgttcatgtc tgaggcaaaa tgttggactg agactgacot taotaaagga cctcacge 

tttgctcaca goataoaatg ctagttaaac aaggagatga ttacgtgtac ctgcctte 
gtacacttat gattgaaagg ttcgtgtcac tggctattga tgcttaccca cttacaaa
atcotaatoa ggagtatgct gatgtctttc acttgtattt aoaatacatt agaaagttac 15780 
atgatgagct tactggccac atgttggaca tgtattccgt aatgctaact aatgataaca 15840 
cctcacggta ctgggaacct gagttttatg aggctatgta cacaccacat acagtcttgc 
aggctgtagg tgcttgtgta ttgtgcaatt cacagacttc acttogttgc ggtgcctgta 
ttaggagacc attoctatgt tgcaagtgct gotatgacca tgtcatttca acatcacaca 
aattagtgtt gtctgttaat ccctatgttt gcaatgcccc aggttgtgat gtcactgatg 
tgacacaact gtatctagga ggtatgagct attattgcaa gtcacataag cctcccatta 16140 
gttttccatt atgtgctaat ggtcaggttt ttggtttata oaaaaacaca tgtgtaggca 16200 
gtgacaatgt cactgacttc aatgogatag caacatgtga ttggactaat gctggcgatt 16260 
acatacttgc caacacttgt actgagagac tcaagctttt cgcagcagaa acgctcaaag 16320 
ccactgagga aacatttaag ctgtcatatg gtattgccac tgtacgcgaa gtactctctg 16380 
acagagaatt goatctttca tgggaggttg gaaaacctag accaccattg aacagaaact 16440 
atgtctttao tggttaccgt gtaactaaaa atagtaaagt aoagattgga gagtaoaoot 16500 
ttgaaaaagg tgactatggt gatgctgttg tgtacagagg taotaogaoa tacaagttga 
atgttggtga ttaotttgtg ttgacatctc acactgtaat gccacttagt gcacctactc 
tagtgccaca agagoactat gtgagaatta ctggcttgta cocaacactc aacatcteag 
atgagttttc tagcaatgtt gcaaattatc aaaaggtcgg catgcaaaag tactctacac 
tccaaggacc acctggtact ggtaagagtc attttgccat cggaottgct ctctattacc 16800 
catctgctcg catagtgtat acggcatgct ctcatgcagc tgttgatgcc ctatgtgaaa 16860 
aggcattaaa atatttgccc atagataaat gtagtagaat oatacctgcg cgtgcgcgcg 16920 
tagagtgttt tgataaattc aaagtgaatt caacactaga acagtatgtt ttctgcactg 16980 
taaatgcatt gccagaaaca actgotgaca ttgtagtctt tgatgaaatc tctatggcta 
ctaattatga cttgagtgtt gtcaatgota gaottcgtgo aaaacaotac gtctatattg 
gcgatcctgc tcaattacca gccccccgca cattgctgac taaaggcaca ctagaaccag 
aatattttaa ttcagtgtgc agacttatga aaacaatagg tccagacatg ttccttggaa 
cttgtcgccg ttgtcctgct gaaattgttg acaotgtgag tgctttagtt tatgacaata 
agctaaaagc acaoaaggat aagtcagctc aatgcttcaa aatgttctac aaaggtgtta 

ttacacgcaa tcctgcttgg agaaaagctg tttttatctc accttataat tcacagaacg 

ctgtagcttc aaaaatctta ggattgccta cgcagactgt tgattcatca cagggttctg 

aatatgacta tgtcatattc acaoaaaota ctgaaacagc aoactcttgt aatgtcaacc 

gcttcaatgt ggctatcaca agggoaaaaa ttggcatttt gtgcataatg tetgatagag 

atctttatga caaactgcaa tttacaagtc tagaaatacc acgtcgcaat gtggctacat 

tacaagcaga aaatgtaact ggacttttta aggactgtag taagatcatt actggtcttc 

atcctaoaca ggoaootaoa cacctcagcg ttgatataaa gttcaagact gaaggattat 

gtgttgacat aooaggcata ccaaaggaca tgaoctaoog tagaotcatc tctatgatgg 
gtttcaaaat gaattaccaa gtcaatggtt accctaatat gtttatcacc cgcgaagaag 17940

ctattcgtca cgttcgtgcg tggattggct ttgatgtaga gggctgtcat gcaactagag 18000 

atgotgtggg tactaaccta cctctccago taggatttto tacaggtgtt aacttagtag 18060 

ctgtaccgac tggttatgtt gacactgaaa ataacacaga attcaccaga gttaatgcaa 18120 

ggaatgtagt gcgtattaag atagtaoaaa tgctcagtga tacactgaaa ggattgtoag 18240 

acagagtcgt gttcgtcctt tgggcgcatg gctttgagot tacatoaatg aagtaotttg 18300 

tcaagattgg acctgaaaga acgtgttgtc tgtgtgacaa acgtgcaact tgcttttcta 18360 

cttcatcaga tacttatgcc tgctggaatc attctgtggg ttttgactat gtctataacc 18420 

catttatgat tgatgttcag cagtggggot ttacgggtaa ccttcagagt aaccatgacc 18480 

aacattgcca ggtacatgga aatgcacatg tggctagttg tgatgctatc atgactagat 18540 

gtttagcagt ccatgagtgc tttgttaagc gcgttgattg gtctgttgaa taccctatta 18600 

taggagatga actgagggtt aattctgctt gcagaaaagt acaacaoatg gttgtgaagt 18660 

ctgcattgct tgctgataag tttccagttc ttcatgaoat tggaaatcca aaggctatca 18720 

agtgtgtgoo tcaggctgaa gtagaatgga agttctacga tgotoagooa tgtagtgaca 18780 

aagottaoaa aatagaggaa ctcttctatt ottatgctac acatcaogat aaattcaotg 18840 

atggtgtttg tttgttttgg aattgtaacg ttgatcgtta occagccaat gcaattgtgt 18900 

gtaggtttga cacaagagtc ttgtcaaact tgaacttacc aggctgtgat ggtggtagtt 18960 

tgtatgtgaa taagcatgca ttccacactc cagctttcga taaaagtgca tttactaatt 19020 

taaagcaatt gcotttcttt tactattctg atagtccttg tgagtctcat ggcaaacaag 19080 

tagtgtcgga tattgattat gttccactca aatotgctac gtgtattaca cgatgcaatt 19140 

taggtggtgc tgtttgcaga caccatgcaa atgagtaccg acagtacttg gatgcatata 19200 

atatgatgat ttctgctgga tttagcctat ggatttacaa acaatttgat acttataacc 19260 

tgtggaatao atttaccagg ttacagagtt tagaaaatgt ggcttataat gttgttaata 19320 

aaggacactt tgatggacac gccggcgaag oacctgtttc catcattaat aatgctgttt 19380 

acacaaaggt agatggtatt gatgtggaga tctttgaaaa taagacaaca cttcctgtta 19440 

atgttgcatt tgagctttgg gctaagcgta acattaaacc agtgccagag attaagatac 19500 

tcaataattt gggtgttgat atcgctgcta ataotgtaat ctgggactac aaaagagaag 19560 

ccccagcaca tgtatctaoa ataggtgtct gcacaatgac tgacattgoo aagaaaccta 19620 

acctttttag aaacgcccgt aatggtgttt taataacaga aggttcagtc aaaggtctaa 19740 

caccttcaaa gggaccagca caagctagcg tcaatggagt cacattaatt ggagaatcag 19800

taaaaacaca gtttaactac tttaagaaag tagacggcat tattcaacag ttgcctgaaa 19860

cctactttac tcagagcaga gacttagagg attttaagcc cagatcacaa atggaaactg 19920

actttctcga gctcgctatg gatgaattca tacagcgata taagctcgag ggctatgcct 19980 

tcgaacacat cgtttatgga gatttcagtc atggacaact tggcggtctt catttaatga 20040 

taggcttagc caagcgctca caagattcac cacttaaatt agaggatttt atccctatgg 20100

acagcacagt gaaaaattac ttcataacag atgcgcaaac aggttcatca aaatgtgtgt 20160
gttctgtgat tgatctttta cttgatgact ttgtcgagat aataaagtca caagatttgt 20220

cagtgatttc aaaagtggtc aaggttacaa ttgactatgc tgaaatttca ttcatgcttt 20280

ggtgtaagga tggacatgtt gaaaccttct acccaaaact acaagcaagt caagcgtggc 20340

aaccaggtgt tgcgatgcct aacttgtaca agatgcaaag aatgcttctt gaaaagtgtg 20400

accttcagaa ttatggtgaa aatgctgtta taccaaaagg aataatgatg aatgtcgcaa 20460 

agtatactca actgtgtcaa tacttaaata cacttacttt agctgtaccc tacaacatga 20520

gagttattca ctttggtgct ggctctgata aaggagttgc accaggtaca gctgtgctca 20580

gacaatggtt gccaactggc acactacttg tcgattcaga tcttaatgac ttcgtctccg 20640

acgcagattc tactttaatt ggagactgtg caacagtaca tacggctaat aaatgggacc 20700 

ttattattag cgatatgtat gaccctagga ccaaacatgt gacaaaagag aatgactcta 20760 

aagaagggtt tttcacttat ctgtgtggat ttataaagca aaaactagcc ctgggtggtt 20820

ctatagctgt aaagataaca gagcattctt ggaatgctga cctttacaag cttatgggcc 20880 

atttctcatg gtggacagct tttgttacaa atgtaaatgc atcatcatcg gaagcatttt 20940 

taattggggc taactatctt ggcaagccga aggaacaaat tgatggctat accatgcatg 21000

ctaactacat tttctggagg aacacaaatc ctatccagtt gtcttcctat tcactctttg 21060

acatgagcaa atttcctctt aaattaagag gaactgctgt aatgtctctt aaggagaatc 21120

aaatcaatga tatgatttat tctcttctgg aaaaaggtag gcttatcatt agagaaaaca 21180 

acagagttgt ggtttcaagt gatattcttg ttaacaacta a 21221



<213> ORGANISM: CORONAVIRUS 



atggacccca atcaaaccaa cgtagtgccc cccgcattac atttggtgga cccacagatt

caactgacaa taaccagaat ggaggacgca atggggcaag gccaaaacag cgccgacccc 

aaggtttacc caataatact gcgtcttggt tcacagctct cactcagcat ggcaaggagg 

aacttagatt ccctcgaggc cagggcgttc caatcaacac caatagtggt ccagatgacc 

aaattggcta ctaccgaaga gctacccgac gagttcgtgg tggtgacggc aaaatga



n Thr Asn Val Val Pro Pro Ala Leu His Leu Val 

a Leu Thr Ile Thr Arg Met Glu Asp Ala Met Gly
25 30

r Ala Asp Pro Lys Val Tyr Pro Ile Ile Leu Arg
3 Phe Gln Ser Thr Pro Ile Val Val Gln Met Thr



Lys Leu Ala Thr Thr Glu Glu Leu Pro Asp Glu Phe V 



<400> SEQUENCE: 34 

atgctgccac cgtgctacaa cttcctcaag gaacaacatt gccaaaaggc ttctacgcag 
agggaagcag aggcggcagt caagcctctt ctcgctcctc atcacgtagt cgcggtaatt
caagaaattc aactcctggc agcagtaggg gaaattctcc tgctcgaatg gctagcggag
gtggtgaaac tgccctcgcg ctattgctgc tag



<213> ORGANISM: CORONAVIRUS 
<400> SEQUENCE: 35 

Met Leu Pro Pro Cys Tyr Asn Phe Leu Lys Glu Gln His Cys Gln Lys
1 5 10 15

Ala Ser Thr Gln Arg Glu Ala Glu Ala Ala Val Lys Pro Leu Leu Ala
Pro His His Val Val Ala Val Ile Gln Glu Ile Gln Leu Leu Ala Ala
Val Gly Glu Ile Leu Leu Leu Glu Trp Leu Ala Glu Val Val Lys Leu



Pro Ser Arg Tyr Cys Cys 
65 70 



<210> SEQ ID NO 36 

<211> LENGTH: 1377

<212> TYPE: DNA

<213> ORGANISM: CORONAVIRUS

<220> FEATURE:

<221> NAME/KEY: CDS

<222> LOCATION: (67)..(1335)

<223> OTHER INFORMATION:

<400> SEQUENCE: 36

atgaaggtca ccaaactgct gcatttac



r Asp Asn Gly Pro Gln S



Pro Asn Asn Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln H
Ser Gly Pro Asp Asp Gln Ile Gly Tyr Tyr Arg A

Val Arg Gly Gly Asp Gly Lys Met Lys Glu Leu S
95 100 105

ttc tat tac cta gga act ggc cca gaa gct tca c
Phe Tyr Tyr Leu Gly Thr Gly Pro Glu Ala Ser L 




130 135 140 

Pro Lys Asp His Ile Gly Thr Arg Asn Pro Asn Asn Asn Ala Ala Thr



Val Leu Gln Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala
160 165 170

Glu Gly Ser Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg

175 180 185 190 



Ser Pro Ala Arg Met Ala S 



y Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn
280 285

a atc aga caa gga act gat tac aaa cat tgg

a Ile Arg Gln Gly Thr Asp Tyr Lys His Trp
370



Leu Pro Gln Arg Gln Lys Lys



t gct gat tca act cag gca taa acactcatga tgaccacaca
r Ala Asp Ser Thr Gln Ala



aggcagatgg gctatgtaaa cg

<210> SEQ ID NO 37 

<212> TYPE: PRT

<213> ORGANISM:

<400> SEQUENCE: 37 

Met Ser Asp Asn Gly Pro Gln Ser Asn Gln Arg Ser Ala Pro Arg Ile

Thr Phe Gly Gly Pro Thr Asp Ser Thr Asp Asn Asn Gln Asn Gly Gly
20 25 30

Arg Asn Gly Ala Arg Pro Lys Gln Arg Arg Pro Gln Gly Leu Pro Asn
35 40 45

Asn Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln His Gly Lys Glu Glu
50 55 60

Leu Arg Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn Ser Gly
65 70 75 80

Pro Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr Arg Arg Val Arg
85 90 95

Gly Gly Asp Gly Lys Met Lys Glu Leu Ser Pro Arg Trp Tyr Phe Tyr
100 105 110

Tyr Leu Gly Thr Gly Pro Glu Ala Ser Leu Pro Tyr Gly Ala Asn Lys
115 120 125

Glu Gly Ile Val Trp Val Ala Thr Glu Gly Ala Leu Asn Thr Pro Lys
130 135 140

145 150 155 160

Gln Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala Glu Gly
165 170 175

Ser Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg Ser Arg
180 185 190 

Gly Asn Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Asn Ser Pro 
195 200 205 

Ala Arg Met Ala Ser Gly Gly Gly Glu Thr Ala Leu Ala Leu Leu Leu 
210 215 220 

Leu Asp Arg Leu Asn Gln Leu Glu Ser Lys Val Ser Gly Lys Gly Gln
225 230 235 240

Gln Gln Gln Gly Gln Thr Val Thr Lys Lys Ser Ala Ala Glu Ala Ser



Lys Lys Pro Arg Gln Lys Arg Thr Ala T
260 265
Gln Ala Phe Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn Phe Gly
275 280 285

Asp Gln Asp Leu Ile Arg Gln Gly Thr Asp Tyr Lys His Trp Pro Gln
290 295 300

Ile Ala Gln Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met Ser Arg

Ile Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu Thr Tyr His Gly

Ala Ile Lys Leu Asp Asp Lys Asp Pro Gln Phe Lys Asp Asn Val Ile
340 345 350

Leu Leu Asn Lys His Ile Asp Ala Tyr Lys Thr Phe Pro Pro Thr Glu
355 360 365

Pro Lys Lys Asp Lys Lys Lys Lys Thr Asp Glu Ala Gln Pro Leu Pro
370 375 380

Gln Arg Gln Lys Lys Gln Pro Thr Val Thr Leu Leu Pro Ala Ala Asp
385 390 395 400

Met Asp Asp Phe Ser Arg Gln Leu Gln Asn Ser Met Ser Gly Ala Ser
405 410 415

Ala Asp Ser Thr Gln Ala

<211> LENGTH: 1377

<213> ORGANISM: CORONAVIRUS

<400> SEQUENCE: 38

atgaaggtca ccaaactgct gcatttagag acgtacttgt tgttttaaat aaacgaacaa
attaaaatgt ctgataatgg accccaatca aaccaacgta gtgccccccg cattacattt
ggtggaccca cagattcaac tgacaataac cagaatggag gacgcaatgg ggcaaggcca
aaacagcgcc gaccccaagg tttacccaat aatactgcgt cttggttcac agctctcact
cagcatggca aggaggaact tagattccct cgaggccagg gcgttccaat caacaccaat
agtggtccag atgaccaaat tggctactac cgaagagcta cccgacgagt tcgtggtggt
gacggcaaaa tgaaagagct cagccccaga tggtacttct attacctagg aactggccca
gaagcttcac ttccctacgg cgctaacaaa gaaggcatcg tatgggttgc aactgaggga
gccttgaata cacccaaaga ccacattggc acccgcaatc ctaataacaa tgctgccacc
gtgctacaac ttcctcaagg aacaacattg ccaaaaggct tctacgcaga gggaagcaga
ggcggcagtc aagcctcttc tcgctcctca tcacgtagtc gcggtaattc aagaaattca
actcctggca gcagtagggg aaattctcct gctcgaatgg ctagcggagg tggtgaaact
gccctcgcgc tattgctgct agacagattg aaccagcttg agagcaaagt ttctggtaaa
ggccaacaac aacaaggcca aactgtcact aagaaatctg ctgctgaggc atctaaaaag
cctcgccaaa aacgtactgc cacaaaacag tacaacgtca ctcaagcatt tgggagacgt
ggtccagaac aaacccaagg aaatttcggg gaccaagacc taatcagaca aggaactgat
tacaaacatt ggccgcaaat tgcacaattt gctccaagtg cctctgcatt ctttggaatg
tcacgcattg gcatggaagt cacaccttcg ggaacatggc tgacttatca t
aaattggatg acaaagatcc acaattcaaa gacaacgtca tactgctgaa
gaagctcagc ctttgccgca gagacaaaag aagcagccca ctgtgactct tcttcctgcg
gctgacatgg atgatttctc cagacaactt caaaattcca tgagtggagc ttctgctgat 
tcaactcagg cataaacact catgatgacc acacaaggca gatgggctat gtaaacg 



<213> 
<400> SEQUENCE: 39

atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt 

ctctaaacga actttaaaat ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac

gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct
tctgcagact gcttacggtt tcgt 

<212> TYPE: DNA 

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 40 

actcaagcat ttgggagacg tggtccagaa caaacccaag gaaatttcgg ggaccaagac

gcctctgcat tctttggaat gtcacgcatt ggcatggaag tcacaccttc gggaacatgg

ctgacttatc atggagccat taaattggat gacaaagatc cacaattcaa agacaacgtc 

atactgctga acaagcacat tgacgcatac aaaacattcc caccaacaga gcctaaaaag

gacaaaaaga aaaagactga tgaagctcag cctttgccgc agagacaaaa gaagcagccc 

actgtgactc ttcttcctgc ggctgacatg gatgatttct ccagacaact tcaaaattcc 

atgagtggag cttctgctga ttcaactcag gcataaacac tcatgatgac cacacaaggc 

agatgggcta tgtaaacgtt ttcgcaattc cgtttacgat acatagtcta ctcttgtgca

gaatgaattc tcgtaactaa acagcacaag taggtttagt taactttaat ctcacatagc

aatctttaat caatgtgtaa cattagggag gacttgaaag agccaccaca ttttcatcga 

ggccacgcgg agtacgatcg agggtacagt gaataatgct agggagagct gcctatatgg 

aagagcccta atgtgtaaaa ttaattttag tagtgctatc cccatgtgat tttaatagct
tcttaggaga atgacaaaaa aaaaaaaaa 



<210> SEQ ID NO 41 



aatgaacaca tagggctgtt caagctgggg cagtacgcct ttttccagct ctactagacc 

acaagtgcca tttttgaggt gttcacgtgc ctccgatagg gcctcttcca cagagtcccc 

gaagccacgc actagcacgt ctctaacctg aaggacaggc aaactgagtt ggacgtgtgt 

tttctcgttg acaccaagaa caaggctctc catcttacct ttcggtcaca cccggacgaa
acctaggtat gctgatgatc gactgcaaca cggacgaaac cgtaagcagt ctgcagaaga
gggacgagtt actcgtttct tgtcaacgac agtaaaattt attattgttt atactgcgta
ggtgcactag gcatgcagcc gagcgacagc tacacagatt ttaaagttcg tttagagaac
agatctacaa gagatcgagg ttggttgg 



<400> SEQUENCE: 42 

atacctaggt ttcgtccggg tgtgaccgaa aggtaagatg gagagccttg ttcttggtgt 

caacgagaaa acacacgtcc aactcagttt gcctgtcctt caggttagag acgtgctagt

gcgtggcttc ggggactctg tggaagaggc cctatcggag gcacgtgaac acctcaaaaa

tggcacttgt ggtctagtag agctggaaaa aggcgtactg ccccagcttg aacagcccta 

ggttgcagaa atggacggca ttcagtacgg tcgtagcggt ataacactgg gagtactcgt 

gccacatgtg ggcgaaaccc caattgcata ccgcaatgtt cttcttcgta agaacggtaa

taagggagcc ggtggtcata gctatggcat cgatctaaag tcttatgact taggtgacga

gcttggcact gatcccattg aagattatga acaaaactgg aacactaagc atggcagtgg 

caatttctgt ggcccagatg ggtaccctct tgattgcatc aaagattttc tcgcacgcgc

gggcaagtca atgtgcactc tttccgaaca acttgattac atcgagtcga agagaggtgt 

ctactgctgc cgtgaccatg agcatgaaat tgcctggttc actgagcgct ctgataagag

ctacgagcac cagacaccct tcgaaattaa gagtgccaag aaatttgaca ctttcaaagg 

ggaatgccca aagtttgtgt ttcctcttaa ctcaaaagtc aaagtcattc aaccacgtgt

tgaaaagaaa aagactgagg gtttcatggg gcgtatacgc tctgtgtacc ctgttgcatc

tccacaggag tgtaacaata tgcacttgtc taccttgatg aaatgtaatc attgcgatga 

agtttcatgg cagacgtgcg actttctgaa agccacttgt gaacattgtg gcactgaaaa 

tttagttatt gaaggaccta ctacatgtgg gtacctacct actaatgctg tagtgaaaat 

gccatgtcct gcctgtcaag acccagagat tggacctgag catagtgttg cagattatca 

ctgtgtgttt gcctatgttg gctgctataa taagcgtgcc tactgggttc ctcgtgctag

tgctgatatt ggctcaggcc atactggcat tactggtgac aatgtggaga ccttgaatga 

ggatctcctt gagatactga gtcgtgaacg tgttaacatt aacattgttg gcgattttca 

tttgaatgaa gaggttgcca tcattttggc atctttctct gcttctacaa gtgcctttat

tgacactata aagagtcttg attacaagtc tttcaaaacc attgttgagt cctgcggtaa

ctataaagtt accaagggaa agcccgtaaa aggtgcttgg aacattggac aacagagatc 

agttttaaca ccactgtgtg gttttccctc acaggctgct ggtgttatca gatcaatttt 

tgcgcgcaca cttgatgcag caaaccactc aattcctgat ttgcaaagag cagctgtcac

catacttgat ggtatttctg aacagtcatt acgtcttgtc gacgcca

gacttctcag tggttgtcta atcttttggg cactactgtt gaaaaactca ggcctatctt 1920

tgaatggatt gaggcgaaac ttagtgcagg agttgaattt ctcaaggatg cttgggagat 1980

tctcaaattt ctcattacag gtgtttttga catcgtcaag ggtcaaatac agg 2033

<210> SEQ ID NO 43 

<211> LENGTH: 2018 

<212> TYPE: DNA 

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 43

ggattgaggc gaaacttagt gcaggagttg aatttctcaa ggatgcttgg gagattctca 60

aatttctcat tacaggtgtt tttgacatcg tcaagggtca aatacaggtt gcttcagata 120

acatcaagga ttgtgtaaaa tgcttcattg atgttgttaa caaggcactc gaaatgtgca 180

ttgatcaagt cactatcgct ggcgcaaagt tgcgatcact caacttaggt gaagtcttca 240 

tcgctcaaag caagggactt taccgtcagt gtatacgtgg caaggagcag ctgcaactac 300

tcatgcctct taaggcacca aaagaagtaa cctttcttga aggtgattca catgacacag 360 

tacttacctc tgaggaggtt gttctcaaga acggtgaact cgaagcactc gagacgcccg 420

ttgatagctt cacaaatgga gctatcgttg gcacaccagt ctgtgtaaat ggcctcatgc 480

tcttagagat taaggacaaa gaacaatact gcgcattgtc tcctggttta ctggctacaa 540

acaatgtctt tcgcttaaaa gggggtgcac caattaaagg tgtaaccttt ggagaagata 600 

ctgtttggga agttcaaggt tacaagaatg tgagaatcac atttgagctt gatgaacgtg 660 

ttgacaaagt gcttaatgaa aagtgctctg tctacactgt tgaatccggt accgaagtta 720

ctgagtttgc atgtgttgta gcagaggctg ttgtgaagac tttacaacca gtttctgatc 780

tccttaccaa catgggtatt gatcttgatg agtggagtgt agctacattc tacttatttg 840 

atgatgctgg tgaagaaaac ttttcatcac gtatgtattg ttccttttac cctccagatg 900

aggaagaaga ggacgatgca gagtgtgagg aagaagaaat tgatgaaacc tgtgaacatg 960 

agtacggtac agaggatgat tatcaaggtc tccctctgga atttggtgcc tcagctgaaa 1020

cagttcgagt tgaggaagaa gaagaggaag actggctgga tgatactact gagcaatcag 1080 

agattgagcc agaaccagaa cctacacctg aagaaccagt taatcagttt actggttatt 1140 

taaaacttac tgacaatgtt gccattaaat gtgttgacat cgttaaggag gcacaaagtg 1200 

ctaatcctat ggtgattgta aatgctgcta acatacacct gaaacatggt ggtggtgtag 1260 

caggtgcact caacaaggca accaatggtg ccatgcaaaa ggagagtgat gattacatta 1320

agctaaatgg ccctcttaca gtaggagggt cttgtttgct ttctggacat aatcttgcta 1380

agaagtgtct gcatgttgtt ggacctaacc taaatgcagg tgaggacatc cagcttctta 1440

aggcagcata tgaaaatttc aattcacagg acatcttact tgcaccattg ttgtcagcag 1500

gcatatttgg tgctaaacca cttcagtctt tacaagtgtg cgtgcagacg gttcgtacac 1560

aggtttatat tgcagtcaat gacaaagctc tttatgagca ggttgtcatg gattatcttg 1620 

ccaaaactga ggagaaatct gtcgtacaga agcctgtcga tgtgaagcca aaaattaagg 1740

cctgcattga tgaggttacc acaacactgg aagaaactaa gtttcttacc aataagttac 1800
tcttgtttgc tgatatcaat ggtaagcttt accatgattc tcagaacatg cttagaggtg 1860

aagatatgtc tttccttgag aaggatgcac cttacatggt aggtgatgtt atcactagtg 1920 

gtgatatcac ttgtgttgta ataccctcca aaaaggctgg tggcactact gagatgctct 1980 

caagagcttt gaagaaagtg ccagttgatg agtatata 2018 



<212> TYPE: DNA 
<213> ORGANISM: 

<400> SEQUENCE: 44

ttgatgaggt taccacaaca ctggaagaaa ctaagtttct taccaataag ttactcttgt 
ttgctgatat caatggtaag ctttaccatg attctcagaa catgcttaga ggtgaagata 
tgtctttcct tgagaaggat gcaccttaca tggtaggtga tgttatcact agtggtgata
tcacttgtgt tgtaataccc tccaaaaagg ctggtggcac tactgagatg ctctcaagag
ctttgaagaa agtgccagtt gatgagtata taaccacgta ccctggacaa ggatgtgctg
gttatacact tgaggaagct aagactgctc ttaagaaatg caaatctgca ttttatgtac
taccttcaga agcacctaat gctaaggaag agattctagg aactgtatcc tggaatttga
gagaaatgct tgctcatgct gaagagacaa gaaaattaat gcctatatgc atggatgtta
gagccataat ggcaaccatc caacgtaagt ataaaggaat taaaattcaa gagggcatcg
ttgactatgg tgtccgattc ttcttttata ctagtaaaga gcctgtagct tctattatta
cgaagctgaa ctctctaaat gagccgcttg tcacaatgcc aattggttat gtgacacatg
gttttaatct tgaagaggct gcgcgctgta tgcgttctct taaagctcct gccgtagtgt
cagtatcatc accagatgct gttactacat ataatggata cctcacttcg tcatcaaaga
catctgagga gcactttgta gaaacagttt ctttggctgg ctcttacaga gattggtcct
attcaggaca gcgtacagag ttaggtgttg aatttcttaa gcgtggtgac aaaattgtgt
accacactct ggagagcccc gtcgagtttc atcttgacgg tgaggttctt tcacttgaca
aactaaagag tctcttatcc ctgcgggagg ttaagactat aaaagtgttc acaactgtgg
acaacactaa tctccacaca cagcttgtgg atatgtctat gacatatgga cagcagtttg
gtccaacata cttggatggt gctgatgtta caaaaattaa acctcatgta aatcatgagg
gtaagacttt ctttgtacta cctagtgatg acacactacg tagtgaagct ttcgagtact
accatactct tgatgagagt tttcttggta ggtacatgtc tgctttaaac cacacaaaga
aatggaaatt tcctcaagtt ggtggtttaa cttcaattaa atgggctgat aacaattgtt
atttgtctag tgttttatta gcacttcaac agcttgaagt caaattcaat gcaccagcac 
ttcaagaggc ttattataga gcccgtgctg gtgatgctgc taacttttgt gcactcatac 



<400> SEQUENCE: 45

gtctat gacatatgga cagcagtttg gtccaacata cttggatggt gctgatgtta

caaaaattaa acctcatgta aatcatgagg gtaagacttt ctttgtacta cctagtgatg

acacactacg tagtgaagct ttcgagtact accatactct tgatgagagt tttcttggta

ggtacatgtc tgctttaaac cacacaaaga aatggaaatt tcctcaagtt ggtggtttaa

cttcaattaa atgggctgat aacaattgtt atttgtctag tgttttatta gcacttcaac 

agcttgaagt caaattcaat gcaccagcac ttcaagaggc ttattataga gcccgtgctg

gtgatgctgc taacttttgt gcactcatac tcgcttacag taataaaact gttggcgagc 

ttggtgatgt cagagaaact atgacccatc ttctacagca tgctaatttg gaatctgcaa

agcgagttct taatgtggtg tgtaaacatt gtggtcagaa aactactacc ttaacgggtg

tagaagctgt gatgtatatg ggtactctat cttatgataa tcttaagaca ggtgtttcca 

ttccatgtgt gtgtggtcgt gatgctacac aatatctagt acaacaagag tcttcttttg

ttatgatgtc tgcaccacct gctgagtata aattacagca aggtacattc ttatgtgcga

atgagtacac tggtaactat cagtgtggtc attacactca tataactgct aaggagaccc 

tctatcgtat tgacggagct caccttacaa agatgtcaga gtacaaagga ccagtgactg 

atgttttcta caaggaaaca tcttacacta caaccatcaa gcctgtgtcg tataaactcg 

atggagttac ttacacagag attgaaccaa aattggatgg gtattataaa aaggataatg

cttactatac agagcagcct atagaccttg taccaactca accattacca aatgcgagtt
ttgataattt caaactcaca tgttctaaca

<210> SEQ ID NO 46 

<211> LENGTH: 1995

<212> TYPE: DNA

<213> ORGANISM: CORONAVIRUS

<400> SEQUENCE: 46

tttgtgcact catactcgct tacagtaata aaactgttgg cgagcttggt gatgtcagag 

aaactatgac ccatcttcta cagcatgcta atttggaatc tgcaaagcga gttcttaatg 

tggtgtgtaa acattgtggt cagaaaacta ctaccttaac gggtgtagaa gctgtgatgt 

atatgggtac tctatcttat gataatctta agacaggtgt ttccattcca tgtgtgtgtg

gtcgtgatgc tacacaatat ctagtacaac aagagtcttc ttttgttatg atgtctgcac

cacctgctga gtataaatta cagcaaggta cattcttatg tgcgaatgag tacactggta 

actatcagtg tggtcattac actcatataa ctgctaagga gaccctctat cgtattgacg 

gagctcacct tacaaagatg tcagagtaca aaggaccagt gactgatgtt ttctacaagg 

agcctataga ccttgtacca actcaaccat taccaaatgc gagttttgat aatttcaaac 

tcacatgttc taacacaaaa tttgctgatg atttaaatca aatgacaggc ttcacaaagc 

cagcttcacg agagctatct gtcacattct tcccagactt gaatggcgat gtagtggcta

ttgactatag acactattca gcgagtttca agaaaggtgc taaattactg cataagccaa 

ttgtttggca cattaaccag gctacaacca agacaacgtt caaaccaaac acttggtgtt

tacgttgtct ttggagtaca aagccagtag atacttcaaa ttcatttgaa gttctggcag
aagttgtagg caatgtcata cttaaaccat cagatgaagg tgttaaagta acacaagagt 1140

taggtcatga ggatcttatg gctgcttatg tggaaaacac aagcattacc attaagaaac 1200

ctaatgagct ttcactagcc ttaggtttaa aaacaattgc cactcatggt attgctgcaa 1260 

ttaatagtgt tccttggagt aaaattttgg cttatgtcaa accattctta ggacaagcag 1320 

cttatgtgtt tacattattg ttccaattgt gtacttttac taaaagtacc aattctagaa 1440

ttagagcttc actacctaca actattgcta aaaatagtgt taagagtgtt gctaaattat 1500 

gtttggatgc cggcattaat tatgtgaagt cacccaaatt ttctaaattg ttcacaatcg 1560 

ctatgtggct attgttgtta agtatttgct taggttctct aatctgtgta actgctgctt 1620

ttggtgtact cttatctaat tttggtgctc cttcttattg taatggcgtt agagaattgt 1680

atcttaattc gtctaacgtt actactatgg atttctgtga aggttctttt ccttgcagca 1740

cgatttcatc gtacaagcta gacttgacaa ttttaggtct ggccgctgag tgggttttgg 1860

catatatgtt gttcacaaaa ttcttttatt tattaggtct ttcagctata atgcaggtgt 1920

tctttggcta ttttgctagt catttcatca gcaattcttg gctcatgtgg tttatcatta 1980

gtattgtaca aatgg 1995 

<211> LENGTH: 1884 

<212> TYPE: DNA

<213> ORGANISM: CORONAVIRUS

<400> SEQUENCE: 47

aattcttggc tcatgtggtt tatcattagt attgtacaaa tggcacccgt ttctgcaatg 60 

gttaggatgt acatcttctt tgcttctttc tactacatat ggaagagcta tgttcatatc 120 

atggatggtt gcacctcttc gacttgcatg atgtgctata agcgcaatcg tgccacacgc 180

ggccgtggct tctgcaagac tcacaattgg aattgtctca attgtgacac attttgcact 300

ggtagtacat tcattagtga tgaagttgct cgtgatttgt cactccagtt taaaagacca 360

atcaacccta ctgaccagtc atcgtatatt gttgatagtg ttgctgtgaa aaatggcgcg 420

cttcacctct actttgacaa ggctggtcaa aagacctatg agagacatcc gctctcccat 480 

tttgtcaatt tagacaattt gagagctaac aacactaaag gttcactgcc tattaatgtc 540 

atagtttttg atggcaagtc caaatgcgac gagtctgctt ctaagtctgc ttctgtgtac 600 

tacagtcagc tgatgtgaca acctattctg ttgcttgacc aagctcttgt atcagacgtt 660 

ggagatagta ctgaagtttc cgttaagatg tttgatgctt atgtcgacac cttttcagca 720 

acttttagtg ttcctatgga aaaacttaag gcacttgttg ctacagctca cagcgagtta 780

gcaaagggtg tagctttaga tggtgtcctt tctacattcg tgtcagctgc ccgacaaggt 840

gttgttgata ccgatgttga cacaaaggat gttattgaat gtctcaaact ttcacatcac 900

tctgacttag aagtgacagg tgacagttgt aacaatttca tgctcaccta taataaggtt 960

gaaaacatga cgcccagaga tcttggcgca tgtattgact gtaatgcaag gcatatcaat 1020

gcccaagtag caaaaagtca caatgtttca ctcatctgga atgtaaaaga ctacatgtct 1080

ttatctgaac agctgcgtaa acaaattcgt agtgctgcca agaagaacaa catacctttt 1140

agactaactt gtgctacaac tagacaggtt gtcaatgtca taactactaa aatctcactc 1200

aagggtggta agattgttag tacttgtttt aaacttatgc ttaaggccac attattgtgc 1260 

gttcttgctg cattggtttg ttatatcgtt atgccagtac atacattgtc aatccatgat 1320

ggttacacaa atgaaatcat tggttacaaa gccattcagg atggtgtcac tcgtgacatc 1380 

atttctactg atgattgttt tgcaaataaa catgctggtt ttgacgcatg gtttagccag 1440 

cgtggtggtt catacaaaaa tgacaaaagc tgccctgtag tagctgctat cattacaaga 1500

gagattggtt tcatagtgcc tggcttaccg ggtactgtgc tgagagcaat caatggtgac 1560

ttcttgcatt ttctacctcg tgttttcagt gctgttggca acatttgcta cacaccttcc 1620

aaactcattg agtatagtga ttttgctacc tctgcttgcg ttcttgctgc tgagtgtaca 1680

atttttaagg atgctatggg caaacctgtg ccatattgtt atgacactaa tttgctagag 1740

ggttctattt cttatagtga gcttcgtcca gacactcgtt atgtgcttat ggatggttcc 1800

atcatacagt ttcctaacac ttacctggag ggttctgtta gagtagtaac aacttttgat 1860

gctgagtact gtagacatgg taca 1884

<212> TYPE: DNA

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 48 

cactcgttat gtgcttatgg atggttccat catacagttt cctaacactt acctggaggg 60 

ttctgttaga gtagtaacaa cttttgatgc tgagtactgt agacatggta catgcgaaag 120 

gtcagaagta ggtatttgcc tatctaccag tggtagatgg gttcttaata atgagcatta 180 

cagagctcta tcaggagttt tctgtggtgt tgatgcgatg aatctcatag ctaacatctt 240

tactcctctt gtgcaacctg tgggtgcttt agatgtgtct gcttcagtag tggctggtgg 300

tattattgcc atattggtga cttgtgctgc ctactacttt atgaaattca gacgtgtttt 360

tggtgagtac aaccatgttg ttgctgctaa tgcacttttg tttttgatgt ctttcactat 420 

actctgtctg gtaccagctt acagctttct gccgggagtc tactcagtct tttacttgta 480 

cttgacattc tatttcacca atgatgtttc attcttggct caccttcaat ggtttgccat 540

gttttctcct attgtgcctt tttggataac agcaatctat gtattctgta tttctctgaa 600 

atttagtacc ttcgaggagg ctgctttgtg tacctttttg ctcaacaagg aaatgtacct 720 

aaaattgcgt agcgagacac tgttgccact tacacagtat aacaggtatc ttgctctata 780 

taacaagtac aagtatttca gtggagcctt agatactacc agctatcgtg aagcagcttg 840

ctgccactta gcaaaggctc taaatgactt tagcaactca ggtgctgatg ttctctacca 900

accaccacag acatcaatca cttctgctgt tctgcagagt ggttttagga aaatggcatt 960 

cccgtcaggc aaagttgaag ggtgcatggt acaagtaacc tgtggaacta caactcttaa 1020 

tggattgtgg ttggatgaca cagtatactg tccaagacat gtcatttgca cagcagaaga 1080 

catgcttaat cctaactatg aagatctgct cattcgcaaa tccaaccata gctttcttgt 1140

gcttaaagtt gatacttcta accctaagac acccaagtat aaatttgtcc gtatccaacc

tggtcaaaca ttttcagttc tagcatgcta caatggttca ccatctggtg tttatcagtg 

tgccatgaga cctaatcata ccattaaagg ttctttcctt aatggatcat gtggtagtgt 

tggttttaac attgattatg attgcgtgtc tttctgctat atgcatcata tggagcttcc 

aacaggagta cacgctggta ctgacttaga aggtaaattc tatggtccat ttgttgacag 

acaaactgca caggctgcag gtacagacac aaccataaca ttaaatgttt tggcatggct

gtatgctgct gttatcaatg gtgataggtg gtttcttaat agattcacca ctactttgaa 

tgactttaac cttgtggcaa tgaagtacaa ctatgaacct ttgacacaag atcatgttga 

catattggga cctctttctg ctcaaacagg aattgccgtc ttagatatgt gtgctgcttt

gaaagagctg ctgcagaatg gtatgaatgg tcgtactatc cttggtagca ctattttaga 

agatgagttt acaccatttg atgttgttag acaatgctct ggtgttacct tccaaggtaa 

gttcaagaaa attgttaagg gcactcatca ttggatgctt ttaactttct tgacatcact 

attgattctt gttcaaagta cacagtggtc actgtttttc tttgtttacg agaatgcttt 

cttgccattt actcttggta ttatggcaat tgctgcatgt 

<210> SEQ ID NO 49 



<400> SEQUENCE: 49 

agcatttcca gcctgaagac gtactgtagc agctaaactg cccagcacca tacctctatt
taggttgttt aagcctttga tgaagtacaa gtatttcact ttaggccctt ttggtgtgtc
tgtaacaaac ctacaaggtg gttccagttc tgtgtaaatt gtacctgtac catcactctt
agggaatcta gcccatttga gatcttggtg gtctgatagt aatgccagca caaacctacc
tcccttcgaa ttgttatagt aggcaagtgc attgtcatca gtacaagctg tttgtgtggt
accagccgca caggacatct gtcgtagtgc tactggactc agttcattat tctgtagttt
aacagctgag ttggctctta gagctgtaac aataagaggc caagccaaat ttggtgaatt 
gtccatgtta atttcactaa gttgaacaat cttgctatcc gcatcaacaa cttgctggat 
ttcccagagt gcagatgcat atgtaaaggt gttaccatca caagtgttct tgtaggtacc 
ataatcaggg acaacaacca tgagtttggc tgctgtagtc aatggtatga tgttgagtgg 
aacacaacca tcacgcgcat tgttgataat gttgttaagt gcatcattat caagcttcct
aagcatagtg aagagcattg tttgcatagc actagttact tttgccctct tgtcctcaga 
tcttgcctgt ttgtacattt gggtcatagc ctgatctgcc atcttttcca acttgcgttg 
catggcagca tcacggtcaa actcagattt agccacattc aaagatttct ttaacttttt
gagaacgact tcagaatcac cattagctac agcctgctca taggcctcct gggcagtggc
ataagcggca tatgatggta aagaactaaa ttctgaagca atagcctgaa gagtagcacg

aacagacaaa agagaaacca tcttctcgaa agcttcagtt gtgtcttttg caagaagaat
atcattgtgg agttgtacac attgtgccca caatttagaa gatgactcta ctctaagttg
agtagcaacc ttgatacatg gtttacctcc aatacccaac aacttaatgt taagcttgaa
agcatcaata ctactcttag gaggcaaaag cccctgggag ttcatatacc taaattcttg 
tgtagagacc aagtagtcat aaacaccaag agtaagcctg aagtaacggt tgagtaaaca 



agctctagct aaaaacatga tagtcgtaac gacaccagaa tagttagagg ttacagaaat 
aactaaggcc cacatggaaa tagcttgatc taaagcatta ccatagtaga ctttgtaaac 
aagtgtaatg acattcatca gtgtccaaac acgtctagca gcatcatcat aaacagtgcg 
agctgtcatg agaataagca aaactaaagc tgaagcatac ataacacaat ccttaagcct 
ataaccagac aagctagtgt cagccaattc aagccatgtc atgatacgca tcacccagct 
agcaggcatg tagaccatat taaagtaagc aactgttgca agagaaggta acagaaacaa 
gcacaagaat gcgtgcttat gcttaacaag cagcatagca catgcagcaa ttgccataat 
accaagagta aatggcaaga aagcattctc gtaaacaaag aaaaacagtg accactgtgt 
actttgaaca agaatcaata gtgatgtcaa gaaagttaaa agcatccaat gatgagtgca 

<210> SEQ ID NO 50 

<211> LENGTH: 20

<212> TYPE: DNA

<213> ORGANISM:

<400> SEQUENCE: 50

cttgtaggtt tgttacagac acaccaaaag ggcctaaagt gaaatacttg t 
aaggcttaaa caacctaaat agaggtatgg tgctgggcag tttagctgct acagtacgtc 
ttcaggctgg aaatgctaca gaagtacctg ccaattcaac tgtgctttcc ttctgtgctt 
ttgcagtaga ccctgctaaa gcatataagg attacctagc aagtggagga caaccaatca 
ccaactgtgt gaagatgttg tgtacacaca ctggtacagg acaggcaatt actgtaacac 
cagaagctaa catggaccaa gagtcctttg gtggtgcttc atgttgtctg tattgtagat 
gccacattga ccatccaaat cctaaaggat tctgtgactt gaaaggtaag tacgtccaaa 
tacctaccac ttgtgctaat gacccagtgg gttttacact tagaaacaca gtctgtaccg 
tctgcggaat gtggaaaggt tatggctgta gttgtgacca actccgcgaa cccttgatgc
agtctgcgga tgcatcaacg tttttaaacg ggtttgcggt gtaagtgcag cccgtcttac 
accgtgcggc acaggcacta gtactgatgt cgtctacagg gcttttgata tttacaacga 
aaaagttgct ggttttgcaa agttcctaaa aactaattgc tgtcgcttcc aggagaagga
tgaggaaggc aatttattag actcttactt tgtagttaag aggcatacta tgtctaacta 
ccaacatgaa gagactattt ataacttggt taaagattgt ccagcggttg ctgtccatga 
ctttttcaag tttagagtag atggtgacat ggtaccacat atatcacgtc agcgtctaac 
taaatacaca atggctgatt tagtctatgc tctacgtcat tttgatgagg gtaattgtga 
tacattaaaa gaaatactcg tcacatacaa ttgctgtgat gatgattatt tcaataagaa 
ggattggtat gacttcgtag agaatcctga catcttacgc gtatatgcta acttaggtga 
gcgtgtacgc caatcattat taaagactgt acaattctgc gatgctatgc gtgatgcagg
gctgatgccc atcctcactt tgactagggc attggctgct gagtcccata tggatgctga 1320

tctcgcaaaa ccacttatta agtgggattt gctgaaatat gattttacgg aagagagact 1380 

ttgtctcttc gaccgttatt ttaaatattg ggaccagaca taccatccca attgtattaa 1440 

ctgtttggat gataggtgta tccttcattg tgcaaacttt aatgtgttat tttctactgt 1500 

gtttccacct acaagttttg gaccactagt aagaaaaata tttgtagatg gtgttccttt 1560

tgttgtttca actggatacc attttcgtga gttaggagtc gtacataatc aggatgtaaa 1620

cttacatagc tcgcgtctca gtttcaagga acttttagtg tatgctgctg atccagctat 1680

gcatgcagct tctggcaatt tattgctaga taaacgcact acatgctttt cagtagctgc 1740 

actaacaaac aatgttgctt ttcaaactgt caaacccggt aattttaata aagactttta 1800

tgactttgct gtgtctaaag gtttctttaa ggaaggaagt tctgttgaac taaaacactt 1860 

cttctttgct caggatggca acgctgctat cagtgattat gactattatc gttataatct 1920 

gccaacaatg tgtgatatca gacaactcct attcgtagtt gaagttgttg ataaatactt 1980

:t gtattaatgc ca 2012 

<210> SEQ ID NO 51



<400> SEQUENCE: 51 

gtacttcgcg tacagtggca ataccatatg acagcttaaa tgtttcctca gtggctttga
gcgtttctgc tgcgaaaagc ttgagtctct cagtacaagt gttggcaagt atgtaatcgc 
cagcattagt ccaatcacat gttgctatcg cattgaagtc agtgacattg tcactgccta 
cacatgtgtt tttgtataaa ccaaaaacct gaccattagc acataatgga aaactaatgg 
gaggcttatg tgacttgcaa taatagctca tacctcctag atacagttgt gtcacatcag
tgacatcaca acctggggca ttgcaaacat agggattaac agacaacact aatttgtgtg 
atgttgaaat gacatggtca tagcagcact tgcaacatag gaatggtctc ctaatacagg 
caccgcaacg aagtgaagtc tgtgaattgc acaatacaca agcacctaca gcctgcaaga 
ctgtatgtgg tgtgtacata gcctcataaa actcaggttc ccagtaccgt gaggtgttat
cattagttag cattacggaa tacatgtcca acatgtggcc agtaagctca tcatgtaact
ttctaatgta ttgtaaatac aagtgaaaga catcagcata ctcctgatta ggatgttttg 
taagtgggta agcatcaata gccagtgaca cgaacctttc aatcataagt gtaccatctg 
ttttgacaat atcatcgaca aaacagcctg cgcctaatat tcttgatgga tctgggtaag 
gcaggtacac gtaatcatct ccttgtttaa ctagcattgt atgctgtgag caaaattcgt 
gaggtccttt agtaaggtca gtctcagtcc aacattttgc ctcagacatg aacacattat
tttgataata aagaactgcc ttaaagttct taatgctagc tactaaacct tgagccgcat 
agttactgtt atagcacaca acggcatcat cagaaagaat catcatggag aaatgtttac 
gcaggtaagc gtaaaactca tccacgaatt catgatcaac atccctattt ctatagagac
actcatagag cctgtgttgt agattgcgga catacttgtc agctatctta ttaccatcag
ttgaaagaag tgcatttaca ttggctgtaa cagcttgaca aatgttaaag acactattag 1200

cacacatgac catctcactt aatacttgcg cacactcgtt agctaacctg tagaaacggt 1320

gtgataagtt acagcaagtg ttatgtttgc gagcaagaac aagagaggcc attatcctaa 1380 

gcatgttagg catggctctg tcacattttg gataatccca acccataagg tgtggagttt 1440

ctacatcact gtaaacagtt tttaacatat tatgccagcc accgtaaaac ttgcttgttc 1500

caattaccac agtagctcct ctagtggcgg ctattgactt caataatttc tgatgaaact 1560

gtctatttgt catagtacta cagatagaga caccagctac ggtgcgagct ctattctttg 1620

cactaatggc atacttaaga ttcatttgag ttatagtagg gatgacatta cgcttagtat 1680

acgcgaaaag tgcatcttga tcctcataac tcattgagtc ataataaagt ctagccttac 1740

cccatttatt aaatgggaaa ccagctgatt tatccagatt gttaacgatt acttggttgg 1800

cattaataca gccaccatcg taacaatcaa agtatttatc aacaacttca actacgaata 1860 

ggagttgtct gatatca 1877 

<210> SEQ ID NO 52 



<400> SEQUENCE: 52 

tcaggtccaa tcttgacaaa gtacttcatt gatgtaagct caaagccatg cgcccaaagg
acgaacacga ctctgtctga caatcctttc agtgtatcac tgagcatttg tactatctta
atacgcacta cattccaggg caagccttta tacatgagtg gtataagatg tttaaactgg 
tcacctggtg gaggttttgc attaactctg gtgaattctg tgttattttc agtgtcaaca 
taaccagtcg gtacagctac taagttaaca cctgtagaaa atcctagctg gagaggtagg 
ttagtaccca cagcatctct agttgcatga cagccctcta catcaaagcc aatccacgca 
cgaacgtgac gaatagcttc ttcgcgggtg ataaacatat tagggtaacc attgacttgg
taattcattt tgaaacccat catagagatg agtctacggt aggtcatgtc ctttggtatg
cctggtatgt caacacataa tccttcagtc ttgaacttta tatcaacgct gaggtgtgta 
ggtgcctgtg taggatgaag accagtaatg atcttactac agtccttaaa aagtccagtt 
acattttctg cttgtaatgt agccacattg cgacgtggta tttctagact tgtaaattgc
agtttgtcat aaagatctct atcagacatt atgcacaaaa tgccaatttt tgcccttgtg
atagccacat tgaagcggtt gacattacaa gagtgtgctg tttcagtagt ttgtgtgaat 
atgacatagt catattcaga accctgtgat gaatcaacag tctgcgtagg caatcctaag 
atttttgaag ctacagcgtt ctgtgaatta taaggtgaga taaaaacagc ttttctocaa 
gcaggattgc gtgtaagaaa ttctcttaca acgcctattt gaggtctgtt gattgcagat 
gaaaoatcat gtgtaataac aoctttgtag aacattttga agcattgagc tgacttatco 
ttgtgtgctt ttagcttatt gtcataaact aaagcactca cagtgtcaac aatttcagoa 
ggacaacggc gacaagttcc aaggaacatg tctggaccta ttgttttcat aagtctgcac 
actgaattaa aatattctgg ttotagtgtg cctttagtca gcaatgtgcg gggggctggt 
aattgagcag gatogacaat atagacgtag tgttttgcac gaagtctagc attgacaaca 
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-continued 

ctcaagtcat aattagtagc catagagatt tcatcaaaga ctacaatgtc agcagttgtt 1320

tctggcaatg catttacagt gcagaaaaca tactgttcta gtgttgaatt cactttgaat 1380

ttatcaaaac actctacgcg cgcacgcgca ggtatgattc tactacattt atctatgggc 1440

aaatatttta atgccttttc acatagggca tcaacagctg catgagagca tgccgtatac 1500 

ccaggtggtc cttggagtgt agagtacttt tgcatgccga ccttttgata atttgcaaca 1620 

ttgctagaaa actcatctga gatgttgagt gttgggtaca agccagtaat tctcacatag 1680

tgctcttgtg gcactagagt aggtgcacta agtggcatta cagtgtgaga tgtcaacaca 1740 

aagtaatcac caacattcaa cttgtatgtc gtagtacctc tgtacacaac agcatcacca 1800 

tagtcacctt tttcaaaggt gtactctcca atctgtactt tactattttt agttacacgg 1860 

taaccagtaa agacatagtt tctgttcaat ggtggtctag gttttccaac ctcccatgaa 1920

agatgcaatt ctctgtcaga gagtacttcg cgtacagtgg caataccata tgacagctta 1980 

aatgtttcct cagtggcttt gagcgtttct gctgcgaaaa gcttgagtct ctcagtacaa 2040 

gtgttggcaa g 2051 

<211> LENGTH: 2075
<212> TYPE: DNA 



<400> SEQUENCE: 53 

tgcttgtagt tttgggtaga aggtttcaac atgtccatcc ttacaccaaa gcatgaatga 60 

aatttcagca tagtcaattg taaccttgac cacttttgaa atcactgaca aatcttgtga 120

ctttattatc tcgacaaagt catcaagtaa aagatcaatc acagaacaca cacattttga 180

tgaacctgtt tgcgcatctg ttatgaagta atttttcact gtgctgtcca tagggataaa 240

atcctctaat ttaagtggtg aatcttgtga gcgcttggct aagcctatca ttaaatgaag 300

accgccaagt tgtccatgac tgaaatctcc ataaacgatg tgttcgaagg catagccctc 360 

gagcttatat cgctgtatga attcatccat agcgagctcg agaaagtcag tttccatttg 420 

tgatctgggc ttaaaatcct ctaagtctct gctctgagta aagtaggttt caggcaactg 480 

ttgaataatg ccgtctactt tcttaaagta gttaaactgt gtttttactg attctccaat 540 

taatgtgact ccattgacgc tagcttgtgc tggtcccttt gaaggtgtta gacctttgac 600

cactctacca tcaaacaaga cagtaagtga agaacaagca ctctcagtag gtttcttggc 720

aatgtcagtc attgtgcaga cacctattgt agatacatgt gctggggctt ctcttttgta 780

gtcccagatt acagtattag cagcgatatc aacacccaaa ttattgagta tcttaatctc 840

tggcactggt ttaatgttac gcttagccca aagctcaaat gcaacattaa caggaagtgt 900

tgtcttattt tcaaagatct ccacatcaat accatctacc tttgtgtaaa cagcattatt 960

aatgatggaa acaggtgctt cgccggcgtg tccatcaaag tgtcctttat taacaacatt 1020 

ataagccaca ttttctaaac tctgtaacct ggtaaatgta ttccacaggt tataagtatc 1080 

gtactgtcgg tactcatttg catggtgtct gcaaacagca ccacctaaat tgcatcgtgt 1200




aatacacgta gcagatttga gtggaacata atcaatatcc gacactactt gtttgccatg 1260 

agactcacaa ggactatcag aatagtaaaa gaaaggcaat tgctttaaat tagtaaatgc 1320

acttttatcg aaagctggag tgtggaatgc atgcttattc acatacaaac taccaccatc 1380

acagcctggt aagttcaagt ttgacaagac tcttgtgtca aacctacaca caattgcatt 1440

ggctgggtaa cgatcaacgt tacaattcca aaacaaacaa acaccatcag tgaatttatc 1500

gtgatgtgta gcataagaat agaagagttc ctctattttg taagctttgt cactacatgg 1560

ctgagcatcg tagaacttcc attctacttc agcctgaggc acacacttga tagcctttgg 1620

atttccaatg tcatgaagaa ctggaaactt atcagcaagc aatgcagact tcacaaccat 1680

gtgttgtact tttctgcaag cagaattaac cctcagttca tctcctataa tagggtattc 1740

aacagaccaa tcaacgcgct taacaaagca ctcatggact gctaaacatc tagtcatgat 1800

agcatcacaa ctagccacat gtgcatttcc atgtacctgg caatgttggt catggttact 1860 

ctgaaggtta cccgtaaagc cccactgctg aacatcaatc ataaatgggt tatagacata 1920 

gtcaaaaccc acagaatgat tccagcaggc ataagtatct gatgaagtag aaaagcaagt 1980 

tgcacgtttg tcacacagac aacacgttct ttcaggtcca atcttgacaa agtacttcat 2040 

tgatgtaagc tcaaagccat gcgcccaaag gacga 2075 

<210> SEQ ID NO 54 

<212> TYPE: DNA 

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 54

aagattcacc acttaaatta gaggatttta tccctatgga cagcacagtg aaaaattact 60

tcataacaga tgcgcaaaca ggttcatcaa aatgtgtgtg ttctgtgatt gatcttttac 120 

ttgatgactt tgtcgagata ataaagtcac aagatttgtc agtgatttca aaagtggtca 180 

aggttacaat tgactatgct gaaatttcat tcatgctttg gtgtaaggat ggacatgttg 240 

aaaccttcta cccaaaacta caagcaagtc aagcgtggca accaggtgtt gcgatgccta 300

atgctgttat accaaaagga ataatgatga atgtcgcaaa gtatactcaa ctgtgtcaat 420 

acttaaatac acttacttta gctgtaccct acaacatgag agttattcac tttggtgctg 480 

gctctgataa aggagttgca ccaggtacag ctgtgctcag acaatggttg ccaactggca 540 

cactacttgt cgattcagat cttaatgact tcgtctccga cgcagattct actttaattg 600 

gagactgtgc aacagtacat acggctaata aatgggacct tattattagc gatatgtatg 660 

accctaggac caaacatgtg acaaaagaga atgactctaa agaagggttt ttcacttatc 720 

tgtgtggatt tataaagcaa aaactagccc tgggtggttc tatagctgta aagataacag 780 

agcattcttg gaatgctgac ctttacaagc ttatgggcca tttctcatgg tggacagctt 840

ttgttacaaa tgtaaatgca tcatcatcgg aagcattttt aattggggct aactatcttg 900 

gcaagccgaa ggaacaaatt gatggctata ccatgcatgc taactacatt ttctggagga 960 

acacaaatcc tatccagttg tcttcctatt cactctttga catgagcaaa tttcctctta 1020 

aattaagagg aactgctgta atgtctctta aggagaatca aatcaatgat atgatttatt 1080 

ctcttctgga aaaaggtagg cttatcatta gagaaaacaa cagagttgtg gtttcaagtg 1140 
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-continued 

gtggtagtga ccttgaccgg tgcaccactt ttgatgatgt tcaagctcct aattacactc 

aacatacttc atctatgagg ggggtttact atcctgatga aatttttaga tcagacactc

tttatttaac tcaggattta tttcttccat tttattctaa tgttacaggg tttcatacta

ttaatcatac gtttggcaac cctgtcatac cttttaagga tggtatttat tttgctgcca

cagagaaatc aaatgttgtc cgtggttggg tttttggttc taccatgaac aacaagtcac

agtcggtgat tattattaac aattctacta atgttgttat acgagcatgt aactttgaat

tgtgtgacaa ccctttcttt gctgtttcta aacccatggg tacacagaca catactatga

tattcgataa tgcatttaat tgcactttcg agtacatatc tgatgccttt tcgcttgatg 

tttcagaaaa gtcaggtaat tttaaacact tacgagagtt tgtgtttaaa aataaagatg 

ggtttctcta tgtttataag ggctatcaac ctatagatgt agttcgtgat ataccttctg 

gttttaacac tttgaaacct atttttaagt tgcctcttgg tattaacatt acaaatttta 

gagccattct tacagccttt tcacctgctc a 

<211> LENGTH: 32 

<213> ORGANISM: artificial sequence 

<220> FEATURE : 
<400> SEQUENCE: 55 

<210> SEQ ID NO 56 



<223> OTHER INFORMATION: N antisens primer 



cccccgggtg cctgagttga atcagcagaa gc 
<210> SEQ ID NO 57 

artificial sequence 



» SEQUENCE : 
tatgag tgacc 



<210> SEQ ID NO 58 

<211> LENGTH: 30 

<212> TYPE: DNA 

<213> ORGANISM: artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION: SL sens primer 



<400> SEQUENCE: 58 
cccatatgaa accttgcac 
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<211> LENGTH: 33 
<212> TYPE: DNA 
<213> ORGANISM: 



ccgggtt taatatattg ctcatatttt c 



<211> LENGTH : 



Antisens set 2 (28774-28759) primer 
<400> SEQUENCE: 61



<400> SEQUENCE: 62 
ggctactacc gaagag 



:isens set 2 (28702-28687 Jprimer 
<400> SEQUENCE: 63



<213> ORGANISM: Probe 1 



: Probe 2/set 1 (28588-28608) 
<400> SEQUENCE : 65 
gccaccgtgc tacaacttcc t 



Probe 1/set 2 /probe N/FL (28541-28563) 
<400> SEQUENCE: 66 
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117 



-continued 




23 



<211> LENGTH: 25 



67 



<213> ORGANISM: Probe 2/set 2/probe SARS/N/LC705 (28565-28589) 
<400> SEQUENCE: 67 



<210> SEQ ID NO 68 
<211> LENGTH : 30 
<212> TYPE : DNA 

<213> ORGANISM: artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: Anchor primer 14T 

<400> SEQUENCE : 68 

agatgaattc ggtacctttt tttttttttt 



<211> LENGTH: 13 

<213> ORGANISM: artificial sequence 
<223> OTHER INFORMATION : M2-14 peptide 
<400> SEQUENCE: 69 

Ala Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln
1 5 10



<210> SEQ ID NO 70 
<211> LENGTH: 12 
<212> TYPE : PRT 

<213> ORGANISM: artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: El-12 peptide 
<400> SEQUENCE: 70

Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu
1 5 10




Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn Leu Asn Ser Ser 
1 5 10 15

Glu Gly Val Pro Asp Leu Leu Val 



<211> LENGTH: 153 

<212> TYPE : DNA 

<213> ORGANISM: CORONAVIRUS 

<400> SEQUENCE: 72 



cccgcaatcc 



25 



gatattaggt ttttacctac ccaggaaaag ccaaccaacc tcgatctctt gtagatctgt 






tctgtgtagc tgtcgctcgg ctgcatgcct agtgca 



<212> TYPE : D 



<400> 

ttctccagac aacttcaaaa ttccatgagt ggagcttctg ctgattcaac tcaggcataa 60

acactcatga tgaccacaca aggcagatgg gctatgtaaa cgttttcgca attccgttta 120 

cgatacatag tctactcttg tgcagaatga attctcgtaa ctaaacagca caagtaggtt 180 

tagttaactt taatctcaca tagcaatctt taatcaatgt gtaacattag ggaggacttg 240 

aaagagccac cacattttca tcgaggccac gcggagtacg atcgagggta cagtgaataa 300

tgctagggag agctgcctat atggaagagc cctaatgtgt aaaattaatt ttagtagtgc 360 

tatccccatg tgattttaat agcttcttag gagaatgaca aaaaaaaaaa 410 

<210> SEQ ID NO 74 



Met Glu Ser Leu Val Leu Gly Val Asn Glu Lys Thr His Val Gln Leu
1 5 10 15

Ser Leu Pro Val Leu Gln Val Arg Asp Val Leu Val Arg Gly Phe Gly
20 25 30

Asp Ser Val Glu Glu Ala Leu Ser Glu Ala Arg Glu His Leu Lys Asn
35 40 45

Gly Thr Cys Gly Leu Val Glu Leu Glu Lys Gly Val Leu Pro Gln Leu
50 55 60

Glu Gln Pro Tyr Val Phe Ile Lys Arg Ser Asp Ala Leu Ser Thr Asn
65 70 75 80

His Gly His Lys Val Val Glu Leu Val Ala Glu Met Asp Gly Ile Gln
85 90 95

Tyr Gly Arg Ser Gly Ile Thr Leu Gly Val Leu Val Pro His Val Gly

Glu Thr Pro Ile Ala Tyr Arg Asn Val Leu Leu Arg Lys Asn Gly Asn
115 120 125

Lys Gly Ala Gly Gly His Ser Tyr Gly Ile Asp Leu Lys Ser Tyr Asp
130 135 140

Leu Gly Asp Glu Leu Gly Thr Asp Pro Ile Glu Asp Tyr Glu Gln Asn
145 150 155 160

Trp Asn Thr Lys His Gly Ser Gly Ala Leu Arg Glu Leu Thr Arg Glu
165 170 175

Leu Asn Gly Gly Ala Val Thr Arg Tyr Val Asp Asn Asn Phe Cys Gly
180 185 190

Pro Asp Gly Tyr Pro Leu Asp Cys Ile Lys Asp Phe Leu Ala Arg Ala



Gly Lys Ser Met Cys Thr Leu Ser Glu Gln Leu Asp Tyr I
210 215 220 
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-continued 



Lys Arg Gly Val Tyr Cys Cys Arg Asp His Glu His Glu Ile Ala Trp
225 230 235 240

Phe Thr Glu Arg Ser Asp Lys Ser Tyr Glu His Gln Thr Pro Phe Glu
245 250 255

Ile Lys Ser Ala Lys Lys Phe Asp Thr Phe Lys Gly Glu Cys Pro Lys
260 265 270

Phe Val Phe Pro Leu Asn Ser Lys Val Lys Val Ile Gln Pro Arg Val
275 280 285

Glu Lys Lys Lys Thr Glu Gly Phe Met Gly Arg Ile Arg Ser Val Tyr
290 295 300

Pro Val Ala Ser Pro Gln Glu Cys Asn Asn Met His Leu Ser Thr Leu
305 310 315 320

Met Lys Cys Asn His Cys Asp Glu Val Ser Trp Gln Thr Cys Asp Phe
325 330 335

Leu Lys Ala Thr Cys Glu His Cys Gly Thr Glu Asn Leu Val Ile Glu
340 345 350

Gly Pro Thr Thr Cys Gly Tyr Leu Pro Thr Asn Ala Val Val Lys Met
355 360 365

Pro Cys Pro Ala Cys Gln Asp Pro Glu Ile Gly Pro Glu His Ser Val
370 375 380

Ala Asp Tyr His Asn His Ser Asn Ile Glu Thr Arg Leu Arg Lys Gly
385 390 395 400

Gly Arg Thr Arg Cys Phe Gly Gly Cys Val Phe Ala Tyr Val Gly Cys
405 410 415

Tyr Asn Lys Arg Ala Tyr Trp Val Pro Arg Ala Ser Ala Asp Ile Gly
420 425 430

Ser Gly His Thr Gly Ile Thr Gly Asp Asn Val Glu Thr Leu Asn Glu
435 440 445

Asp Leu Leu Glu Ile Leu Ser Arg Glu Arg Val Asn Ile Asn Ile Val
450 455 460

Gly Asp Phe His Leu Asn Glu Glu Val Ala Ile Ile Leu Ala Ser Phe
465 470 475 480

Ser Ala Ser Thr Ser Ala Phe Ile Asp Thr Ile Lys Ser Leu Asp Tyr
485 490 495

Lys Ser Phe Lys Thr Ile Val Glu Ser Cys Gly Asn Tyr Lys Val Thr
500 505 510

Lys Gly Lys Pro Val Lys Gly Ala Trp Asn Ile Gly Gln Gln Arg Ser
515 520 525

Val Leu Thr Pro Leu Cys Gly Phe Pro Ser Gln Ala Ala Gly Val Ile
530 535 540

Arg Ser Ile Phe Ala Arg Thr Leu Asp Ala Ala Asn His Ser Ile Pro
545 550 555 560

Asp Leu Gln Arg Ala Ala Val Thr Ile Leu Asp Gly Ile Ser Glu Gln
565 570 575

Ser Leu Arg Leu Val Asp Ala Met Val Tyr Thr Ser Asp Leu Leu Thr
580 585 590

Asn Ser Val Ile Ile Met Ala Tyr Val Thr Gly Gly Leu Val Gln Gln
595 600 605

Thr Ser Gln Trp Leu Ser Asn Leu Leu Gly Thr Thr Val Glu Lys Leu
610 615 620
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Arg Pro He Phe Glu Trp He Glu Ala Lys Leu Ser Ala Gly Val Glu 

Phe Leu Lys Asp Ala Trp Glu Ile Leu Lys Phe Leu Ile Thr Gly Val
645 650 655

Phe Asp Ile Val Lys Gly Gln Ile Gln Val Ala Ser Asp Asn Ile Lys
660 665 670

Asp Cys Val Lys Cys Phe Ile Asp Val Val Asn Lys Ala Leu Glu Met
675 680 685

Cys Ile Asp Gln Val Thr Ile Ala Gly Ala Lys Leu Arg Ser Leu Asn
690 695 700

Leu Gly Glu Val Phe Ile Ala Gln Ser Lys Gly Leu Tyr Arg Gln Cys
705 710 715 720

Ile Arg Gly Lys Glu Gln Leu Gln Leu Leu Met Pro Leu Lys Ala Pro
725 730 735

Lys Glu Val Thr Phe Leu Glu Gly Asp Ser His Asp Thr Val Leu Thr
740 745 750

Ser Glu Glu Val Val Leu Lys Asn Gly Glu Leu Glu Ala Leu Glu Thr

Pro Val Asp Ser Phe Thr Asn Gly Ala Ile Val Gly Thr Pro Val Cys
770 775 780

Val Asn Gly Leu Met Leu Leu Glu Ile Lys Asp Lys Glu Gln Tyr Cys
785 790 795 800

Ala Leu Ser Pro Gly Leu Leu Ala Thr Asn Asn Val Phe Arg Leu Lys
805 810 815

Gly Gly Ala Pro Ile Lys Gly Val Thr Phe Gly Glu Asp Thr Val Trp
820 825 830

Glu Val Gln Gly Tyr Lys Asn Val Arg Ile Thr Phe Glu Leu Asp Glu
835 840 845

Arg Val Asp Lys Val Leu Asn Glu Lys Cys Ser Val Tyr Thr Val Glu
850 855 860

Ser Gly Thr Glu Val Thr Glu Phe Ala Cys Val Val Ala Glu Ala Val
865 870 875 880

Val Lys Thr Leu Gln Pro Val Ser Asp Leu Leu Thr Asn Met Gly Ile

Asp Leu Asp Glu Trp Ser Val Ala Thr Phe Tyr Leu Phe Asp Asp Ala

Gly Glu Glu Asn Phe Ser Ser Arg Met Tyr Cys Ser Phe Tyr Pro Pro
915 920 925

Asp Glu Glu Glu Glu Asp Asp Ala Glu Cys Glu Glu Glu Glu Ile Asp
930 935 940

Glu Thr Cys Glu His Glu Tyr Gly Thr Glu Asp Asp Tyr Gln Gly Leu
945 950 955 960

Pro Leu Glu Phe Gly Ala Ser Ala Glu Thr Val Arg Val Glu Glu Glu
965 970 975

Glu Glu Glu Asp Trp Leu Asp Asp Thr Thr Glu Gln Ser Glu Ile Glu
980 985 990

Pro Glu Pro Glu Pro Thr Pro Glu Glu Pro Val Asn Gln Phe Thr Gly
995 1000 1005

Tyr Leu Lys Leu Thr Asp Asn Val Ala Ile Lys Cys Val Asp Ile
1010 1015 1020
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1030 1035 
j Lys His Gly Gly Gly Val Ala Gly Ala Leu 



Asn Lys Ala Thr Asn Gly Ala 



Asn Leu Asn Ala Gly Glu Asp Ile Gln Leu Leu Lys Ala Ala Tyr
1100 1105 1110

Glu Asn Phe Asn Ser Gln Asp Ile Leu Leu Ala Pro Leu Leu Ser
1115 1120 1125

Ala Gly Ile Phe Gly Ala Lys Pro Leu Gln Ser Leu Gln Val Cys
1130 1135 1140

Val Gln Thr Val Arg Thr Gln Val Tyr Ile Ala Val Asn Asp Lys



Gly G 

Gly Asp Val He Thr 
Ser Lys Lys Ala Gly Gly T 



Lys Lys Val Pro Val Asp Glu Tyr He 
1295 1300 



: He Gin Arg Lys Tyr Lys 
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1420 1425 
s Gly Phe Asn Leu Glu Glu Ala Ala Arg Cys Met Arg 



?he His Leu Asp Gly Glu Val Leu Ser Leu Asp Lys Leu 

jeu Leu Ser Leu Arg Glu Val Lys Thr He Lys Val Phe 
1540 1545 

/al Asp Asn Thr Asn Leu His Thr Gin Leu Val Asp Met 

Chr Tyr Gly Gin Gin Phe Gly Pro Thr Tyr Leu Asp Gly 



Phe Phe Val Leu Pro Ser Asp Asp Thr Leu Arg Ser Glu Ala 
.595 1600 1605 



Phe Glu Tyr Tyr His Thr Leu Asp G 



.655 1660 

Pro Ala Leu Gin Glu Ala Tyr Tyr Arg Ala Arg Ala Gly Asp 

sn Phe Cys Ala Leu He Leu Ala Tyr Ser Asn Lys Thr 

.y Glu Leu Gly Asp Val Arg Glu Thr Met Thr His Leu Leu 



s Ala Asn Li 
15 1720 1725 



1 Met Met Ser Ala Pro Pro 
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Ala Glu Tyr L 

Tyr Thr Gly A 
1805 

Lys Glu Thr L 

Ser Glu Tyr L 
1835 

1850 

Vol Thr Tyr T 
1865 

Lys Asp Asn A 
1880 

Thr Gin Pro L 



Ala Ser Phe LyB 



; Leu Gin Gin Gly Thr P 

i Tyr Gin Cys Gly His T 
1810 

l Tyr Arg lie Asp Gly A 
i Gly Pro Val Thr Asp V 
: Thr lie Lys Pro Val S 
Pro Lys L 

) 

'hr Glu Gin P 
885 



Trp His lie 
1970 

Thr Trp Cys 



Asn Gin Ala 



Ala 



Arg 



His He Thr Ala 
815 

i His Leu Thr Lys Met 

. Phe Tyr Lys Glu Thr 
1845 

: Tyr Lys Leu Asp Gly 
1860 

i Asp Gly Tyr Tyr Lys 
1875 

i He Asp Leu Val Pro 
1890 

) Asn Phe Lys Leu Thr 



Ala He Asp Tyr 



Lys Gly Ala Lys Leu 

■160 

.975 y 
Trp Ser 



.990 



a Cys Glu Ser Gin Gin E 



2025 
He Glu 



His Tyr Ser 

0 
5 
0 

Val Asp Thr 

5 

Gin Gly Met 



s Glu Asp Leu M 



Ser He Thr He Lys Lys F 



o y 

Leu Ala Tyr 



a Gin Arg Val Phe 






2165 2170 2175 

Ser Val Lys Ser Val Ala Lys Leu Cys Leu Asp Ala Gly Ile Asn

2180 2185 2190 

Tyr Val Lys Ser Pro Lys Phe Ser Lys Leu Phe Thr Ile Ala Met



Trp Leu Leu Leu Leu Ser Ile Cys Leu Gly Ser Leu Ile Cys Val

2210 2215 2220 

Thr Ala Ala Phe Gly Val Leu Leu Ser Asn Phe Gly Ala Pro Ser 

2225 2230 2235 

Tyr Cys Asn Gly Val Arg Glu Leu Tyr Leu Asn Ser Ser Asn Val 

2240 2245 2250 

Thr Thr Met Asp Phe Cys Glu Gly Ser Phe Pro Cys Ser Ile Cys

2255 2260 2265 

Leu Ser Gly Leu Asp Ser Leu Asp Ser Tyr Pro Ala Leu Glu Thr 

2270 2275 2280 

Ile Gln Val Thr Ile Ser Ser Tyr Lys Leu Asp Leu Thr Ile Leu

2285 2290 2295 

Gly Leu Ala Ala Glu Trp Val Leu Ala Tyr Met Leu Phe Thr Lys 



2315 2320 2325

Gly Tyr Phe Ala Ser His Phe Ile Ser Asn Ser Trp Leu Met Trp
2330 2335 2340

Phe Ile Ile Ser Ile Val Gln Met Ala Pro Val Ser Ala Met Val
2345 2350 2355

Arg Met Tyr Ile Phe Phe Ala Ser Phe Tyr Tyr Ile Trp Lys Ser
2360 2365 2370

Tyr Val His Ile Met Asp Gly Cys Thr Ser Ser Thr Cys Met Met
2375 2380 2385

Cys Tyr Lys Arg Asn Arg Ala Thr Arg Val Glu Cys Thr Thr Ile
2390 2395 2400

Val Asn Gly Met Lys Arg Ser Phe Tyr Val Tyr Ala Asn Gly Gly
2405 2410 2415 



Arg Gly Phe Cys Lys Thr His Asn Trp Asn Cys Leu Asn Cys Asp 



Thr Phe Cys Thr Gly Ser Thr Phe Ile Ser Asp Glu Val Ala Arg
2435 2440 2445

Asp Leu Ser Leu Gln Phe Lys Arg Pro Ile Asn Pro Thr Asp Gln
2450 2455 2460

Ser Ser Tyr Ile Val Asp Ser Val Ala Val Lys Asn Gly Ala Leu
2465 2470 2475

His Leu Tyr Phe Asp Lys Ala Gly Gln Lys Thr Tyr Glu Arg His
2480 2485 2490 

Pro Leu Ser His Phe Val Asn Leu Asp Asn Leu Arg Ala Asn Asn 
2495 2500 2505 

Thr Lys Gly Ser Leu Pro Ile Asn Val Ile Val Phe Asp Gly Lys
2510 2515 2520 

Ser Lys Cys Asp Glu Ser Ala Ser Lys Ser Ala Ser Val Tyr Tyr 



Ser Gln Leu Met Cys Gln Pro Ile Leu Leu Leu Asp Gln Ala Leu
2540 2545 2550 
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Asp Ala Tyr Val Asp Thr Phe Ser Ala Thr Phe Ser Val Pro Met 
Glu Lys Leu Lys Ala Leu Val Ala Thr Ala His Ser Glu Leu Ala 
Lys Gly Val Ala Leu Asp Gly Val Leu Ser Thr Phe Val Ser Ala 



2640 

Met Leu Thr Tyr Asn L 



2650 

Gly Ala Cys lie Asp Cys A 



Pro Phe Arg 
He Thr Thr 



Leu Met Leu Lys Ala Thr Leu Leu Cys Val Leu Ala 
2750 2755 2760 

Cys Tyr He Val Met Pro Val His Thr Leu Ser He 



Ala Gly Phe Asp Ala Trp Phe Ser Gin Arg Gly Gly Ser Tyr Lys 

Asn Asp Lys Ser Cys Pro Val Val Ala Ala He He Thr Arg Glu 

He Gly Phe He Val Pro Gly Leu Pro Gly Thr Val Leu Arg Ala 

He Asn Gly Asp Phe Leu His Phe Leu Pro Arg Val Phe Ser Ala 
2855 

Sly Asn He C 
2870 

Asp Phe Ala Thr Ser Ala Cys Val Leu Ala Ala Glu Cys 1 
2885 2890 2895 

Phe Lys Asp Ala Met Gly Lys Pro Val Pro Tyr Cys Tyr A 
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c He He Gin 



t Lys Phe Arg Arg Val P 



3170 



1 Thr Phe Ser Thr Phe Glu 
J Asn Lys Glu Met Tyr Leu 
o Leu Thr Gin Tyr Asn Arg 



Tyr Leu Ala Leu Tyr Asn Lys Tyr Lys Tyr Phe Ser Gly Ala 
3185 3190 3195 



His Val He Cys Thr Ala Glu Asp 

3295 P 3300 
7al Gin Ala Gly Asn Val Gin Leu Arg Val He 
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-continued 

3305 3310 3315 

Gly His Ser Met Gln Asn Cys Leu Leu Arg Leu Lys Val Asp Thr

Ser Asn Pro Lys Thr Pro Lys Tyr Lys Phe Val Arg Ile Gln Pro

Gly Gln Thr Phe Ser Val Leu Ala Cys Tyr Asn Gly Ser Pro Ser
3350 3355 3360

Gly Val Tyr Gln Cys Ala Met Arg Pro Asn His Thr Ile Lys Gly
3365 3370 3375

Ser Phe Leu Asn Gly Ser Cys Gly Ser Val Gly Phe Asn Ile Asp

Tyr Asp Cys Val Ser Phe Cys Tyr Met His His Met Glu Leu Pro 

3395 340" 

Thr Gly Val His Ala Gly Thr 
3410 341 

Pro Phe Val Asp Arg Gln Thr Ala Gln Ala Ala Gly Thr Asp Thr



3565 3570 

Ser Leu Phe Phe Phe Val Tyr Glu Asn A 
3580 3585 

Leu Gly He Met Ala He Ala Ala Cys A 



Pro Ser Leu Ala T 

Ala Ser Trp Val M 

Thr Ser Leu Ser Gly Tyr Arg Leu Lys Asp Cys V 

Ser Ala Leu Val Leu Leu He Leu Met Thr Ala A 
3665 3670 3 

Asp Asp Ala Ala Arg Arg Val Trp Thr Leu Met A 
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Tyr Lys Val Tyr Tyr Gly Asn Ala L 
5 3700 

Trp Ala Leu Val He Ser Val Thr S 



i Asp Gin Ala He 



Cys Tyr Phe Gly Leu Phe Cys Leu L 



Thr Leu Gly Val Tyr Asp Tyr Leu V 



a Ser Gin Gly Leu 



a Phe Lys Leu Asn He Lys 



Cys He Lys Val A 



5 3880 3885 

Leu Leu Ser Val Leu Leu Ser Met Gin Gly Ala V 
0 3895 3900 

Arg Leu Cys Glu Glu Met 
5 3910 

He Ala Ser Glu Phe Ser Ser Leu Pro S 
0 3925 3 

Thr Ala Gin Glu Ala Tyr Glu Gin Ala V 



Glu V 



3965 

3980 

3 Ser G 
3995 

t Leu P 
4010 

a He I 
4025 



Glu Asp Lys Arg A 



Lys Leu A 

5 

Asp Gly C 



He Pro Leu Thr Thr Ala Ala Lys 



s Lys Ser Leu Asn Val 
a Met Gin Arg Lys Leu 

1005 

:o Leu Asn He 
135 

il Val Pro Asp 



Tyr Gly Thr Tyr Lys 



Cys Asp Gly Asn Thr Phe Thr Tyr 
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4081 
Leu Ala 



Trp Pro Leu lie Val 
Leu Gin Asn A 
Cys Ala Ala G 

0 
5 

Leu Leu Ser A 

0 

Ser Asp Gly T 



L Ala Leu Arg Gin 



Gin Asp Leu Lys Trp Ala Arg Phe 



Tyr Leu Tyr Phe He 
4205 

Val Leu Gly Ser Leu 
4220 

Ala Thr Glu Val Pro 
4235 

Gly Gly Gin Pro He Thr A 

Thr Gly Thr Gly Gin A 
4280 

Asp Gin Glu Ser Phe G 
4295 

He Asp His P 
Tyr Val Gin I 
Thr Leu Arg A 
Tyr Gly Cys S 



Leu Asn Asn Leu Asn Arg Gly Met 
0 4215 

Thr val Arg Leu Gin Ala Gly Asn 

Ser Thr Val Leu Ser Phe Cys Ala 

Ala Tyr Lys Asp Tyr Leu Ala Ser 

Cys Val Lys Met L 



Thr Val T 



Cys His 
Lys Gly 



4275 



e Cys Asp Leu Lys 



u Asn Gly Phe Ala Val 



<211> LENGTH: 2695

<213> ORGANISM: CORONAVIRUS



Arg Val Cys Gly Val Ser Ala Ala Arg Leu Thr Pro Cys Gly Thr Gly 
1 5 10 15

Thr Ser Thr Asp Val Val Tyr Arg Ala Phe Asp Ile Tyr Asn Glu Lys
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Val Ala Gly Phe Ala Lys Phe Leu Lys Thr Asn Cys Cys Arg Phe Gin 
35 40 45 

Glu Lys Asp Glu Glu Gly Asn Leu Leu Asp Ser Tyr Phe Val Val Lys 
50 55 60 

Arg His Thr Met Ser Asn Tyr Gln His Glu Glu Thr Ile Tyr Asn Leu
65 70 75 80

Val Lys Asp Cys Pro Ala Val Ala Val His Asp Phe Phe Lys Phe Arg
85 90 95

Val Asp Gly Asp Met Val Pro His Ile Ser Arg Gln Arg Leu Thr Lys
100 105 110

Tyr Thr Met Ala Asp Leu Val Tyr Ala Leu Arg His Phe Asp Glu Gly
115 120 125

Asn Cys Asp Thr Leu Lys Glu Ile Leu Val Thr Tyr Asn Cys Cys Asp
130 135 140

Asp Asp Tyr Phe Asn Lys Lys Asp Trp Tyr Asp Phe Val Glu Asn Pro
145 150 155 160

Asp Ile Leu Arg Val Tyr Ala Asn Leu Gly Glu Arg Val Arg Gln Ser
165 170 175

Leu Leu Lys Thr Val Gln Phe Cys Asp Ala Met Arg Asp Ala Gly Ile
180 185 190

Val Gly Val Leu Thr Leu Asp Asn Gln Asp Leu Asn Gly Asn Trp Tyr
195 200 205

Asp Phe Gly Asp Phe Val Gln Val Ala Pro Gly Cys Gly Val Pro Ile
210 215 220

Val Asp Ser Tyr Tyr Ser Leu Leu Met Pro Ile Leu Thr Leu Thr Arg
225 230 235 240

Ala Leu Ala Ala Glu Ser His Met Asp Ala Asp Leu Ala Lys Pro Leu
245 250 255

Ile Lys Trp Asp Leu Leu Lys Tyr Asp Phe Thr Glu Glu Arg Leu Cys
260 265 270

Leu Phe Asp Arg Tyr Phe Lys Tyr Trp Asp Gln Thr Tyr His Pro Asn
275 280 285

Cys Ile Asn Cys Leu Asp Asp Arg Cys Ile Leu His Cys Ala Asn Phe
290 295 300

Asn Val Leu Phe Ser Thr Val Phe Pro Pro Thr Ser Phe Gly Pro Leu
305 310 315 320

Val Arg Lys Ile Phe Val Asp Gly Val Pro Phe Val Val Ser Thr Gly
325 330 335

Tyr His Phe Arg Glu Leu Gly Val Val His Asn Gln Asp Val Asn Leu

His Ser Ser Arg Leu Ser Phe Lys Glu Leu Leu Val Tyr Ala Ala Asp
355 360 365

Pro Ala Met His Ala Ala Ser Gly Asn Leu Leu Leu Asp Lys Arg Thr
370 375 380

Thr Cys Phe Ser Val Ala Ala Leu Thr Asn Asn Val Ala Phe Gln Thr
385 390 395 400 

Val Lys Pro Gly Asn Phe Asn Lys Asp Phe Tyr Asp Phe Ala Val Ser 



Lys Gly Phe Phe Lys Glu Gly Ser Ser Val Glu Leu Lys His E 
420 425 430 
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Phe Ala Gin Asp Gly Asn Ala Ala lie Ser Asp Tyr Asp Tyr Tyr Arg 
435 440 445

Tyr Asn Leu Pro Thr Met Cys Asp Ile Arg Gln Leu Leu Phe Val Val
450 455 460

Glu Val Val Asp Lys Tyr Phe Asp Cys Tyr Asp Gly Gly Cys Ile Asn
465 470 475 480

Ala Asn Gln Val Ile Val Asn Asn Leu Asp Lys Ser Ala Gly Phe Pro

Phe Asn Lys Trp Gly Lys Ala Arg Leu Tyr Tyr Asp Ser Met Ser Tyr
500 505 510

Glu Asp Gln Asp Ala Leu Phe Ala Tyr Thr Lys Arg Asn Val Ile Pro
515 520 525

Thr Ile Thr Gln Met Asn Leu Lys Tyr Ala Ile Ser Ala Lys Asn Arg
530 535 540

Ala Arg Thr Val Ala Gly Val Ser Ile Cys Ser Thr Met Thr Asn Arg
545 550 555 560

Gln Phe His Gln Lys Leu Leu Lys Ser Ile Ala Ala Thr Arg Gly Ala

Thr Val Val Ile Gly Thr Ser Lys Phe Tyr Gly Gly Trp His Asn Met
580 585 590

Leu Lys Thr Val Tyr Ser Asp Val Glu Thr Pro His Leu Met Gly Trp
595 600 605

Asp Tyr Pro Lys Cys Asp Arg Ala Met Pro Asn Met Leu Arg Ile Met
610 615 620

Ala Ser Leu Val Leu Ala Arg Lys His Asn Thr Cys Cys Asn Leu Ser
625 630 635 640

His Arg Phe Tyr Arg Leu Ala Asn Glu Cys Ala Gln Val Leu Ser Glu
645 650 655

Met Val Met Cys Gly Gly Ser Leu Tyr Val Lys Pro Gly Gly Thr Ser
660 665 670

Ser Gly Asp Ala Thr Thr Ala Tyr Ala Asn Ser Val Phe Asn Ile Cys
675 680 685

Gln Ala Val Thr Ala Asn Val Asn Ala Leu Leu Ser Thr Asp Gly Asn
690 695 700

Lys Ile Ala Asp Lys Tyr Val Arg Asn Leu Gln His Arg Leu Tyr Glu
705 710 715 720

Cys Leu Tyr Arg Asn Arg Asp Val Asp His Glu Phe Val Asp Glu Phe
725 730 735

Tyr Ala Tyr Leu Arg Lys His Phe Ser Met Met Ile Leu Ser Asp Asp
740 745 750

Ala Val Val Cys Tyr Asn Ser Asn Tyr Ala Ala Gln Gly Leu Val Ala
755 760 765

Ser Ile Lys Asn Phe Lys Ala Val Leu Tyr Tyr Gln Asn Asn Val Phe
770 775 780

Met Ser Glu Ala Lys Cys Trp Thr Glu Thr Asp Leu Thr Lys Gly Pro
785 790 795 800

His Glu Phe Cys Ser Gln His Thr Met Leu Val Lys Gln Gly Asp Asp
805 810 815

Tyr Val Tyr Leu Pro Tyr Pro Asp Pro Ser Arg Ile Leu Gly Ala Gly
820 825 830

Cys Phe Val Asp Asp Ile Val Lys Thr Asp Gly Thr Leu Met Ile Glu
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a Thr Lys His Pro 



Arg Phe Val Ser Leu Ala Ile Asp Ala Tyr

Asn Gln Glu Tyr Ala Asp Val Phe His Leu Tyr Leu Gln Tyr Ile Arg

Lys Leu His Asp Glu Leu Thr Gly His Met Leu Asp Met Tyr Ser Val

885 890 895

Met Leu Thr Asn Asp Asn Thr Ser Arg Tyr Trp Glu Pro Glu Phe Tyr 

900 905 910 

Glu Ala Met Tyr Thr Pro His Thr Val Leu Gln Ala Val Gly Ala Cys

915 920 925

Val Leu Cys Asn Ser Gln Thr Ser Leu Arg Cys Gly Ala Cys Ile Arg

930 935 940

Arg Pro Phe Leu Cys Cys Lys Cys Cys Tyr Asp His Val Ile Ser Thr

945 950 955 960

Ser His Lys Leu Val Leu Ser Val Asn Pro Tyr Val Cys Asn Ala Pro

965 970 975

Gly Cys Asp Val Thr Asp Val Thr Gln Leu Tyr Leu Gly Gly Met Ser



985 



Tyr Tyr Cys Lyt 
Asn Gly Gin Val Phe Gly 



040 



Lys 



Leu Cye Ala 



Lys Leu Phe Ala Ala 

1055 1060 

Lys Leu Ser Tyr Gly He Ala Thr Val Arg Glu 
1070 1075 

Arg Glu Leu His Leu Ser Trp Glu Val Gly Lys 
-"5 1090 

Arg Asn Tyr Val Phe Thr Gly Tyr Arg 



1105 

,ys Val Gin He Gly Glu Tyr T 



1000 1005 

Tyr Lys Asn Thr Cys Val Gly S 



Glu Arg Leu 
Leu Lys Ala Thr Glu Glu Thr Phe 



15 



1120 



Gly Asp Ala Val Val Tyr Arg Gly Thr Thr T 



Val Ala Asn Tyr Gin Lys Val Gly Met Gin Lys 
1195 

Gin Gly Pro Pro Gly Thr Gly Lys Ser His Phe 
1210 

Tyr Tyr Pro Ser Ala Arg He Val Tyr 

1225 



Pro Arg Pro Pro 
' "95 

s Gly Asp Tyr 

'yr Lys Leu Asn 

Val Met Pro Leu 
55 

il Arg He Thr 
70 

Phe Ser Ser Asn 

.185 

Tyr Ser Thr Leu 
He Gly Leu 

1230 
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Cys Glu Lys Ala Leu Lys Tyr Leu 
0 1245 

Pro He Asp Lys Cys Ser Arg He He Pro Ala Arg Ala Arg Val 

Glu Cys Phe Asp Lys Phe Lys Val Asn Ser Thr Leu Glu Gin Tyr 
1265 " " 1270 1275 



Val Val Phe Asp Glu He Ser Met Ala 
1295 1300 



7al Tyr Asp Asn Lys 



Tyr Lys Gly Val He Thr His Asp Val Ser Ser Ala I 

Pro Gin He Gly Val Val Arg Glu Phe Leu Thr Arg i 
1415 1420 1425 



a Thr Ala His Ser Cys Asn 
1475 1480 

r Arg Ala Lys He Gly He 
1490 1495 

n Tyr Asp Lys Leu Gin Phe Thr Ser Leu Glu He Pro Arg A 
1505 1510 1515 

a Val Ala Thr Leu Gin Ala Glu Asn Val Thr Gly Leu Phe L 
1520 1525 1530 

p Cys Ser Lys He He Thr Gly Leu His Pro Thr Gin Ala E 
1535 1540 1545 
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0 

Val Arg lie Lys 



1720 Y 
s Asp Lys 



Tyr Val Tyr Asn Pro Phe 
1775 

Gly Asn Ala His Val Ala 

Leu Ala Val His Glu Cys 

Glu Tyr Pro lie He Gly 

Arg Lys Val Gin His Met 
1835 

Lys Phe Pro Val Leu His 
Cys Val Pro Gin Ala Glu 
Pro Cys Ser Asp Lys Ala 

Trp Asn Cys Asn Val Asp 



1925 



1955 
Phe Tyr Tyr S 



L> Tyr 



ti Ala Lys Pro Pro Pro Gly Asp 
a Met Tyr Lys Gly Leu Pro Trp 
1 Gin Met Leu Ser Asp Thr Leu 



s Asp Gin His Cys Gin Val H 



Phe Val L 
Asp Glu 



c Asp Gly Val Cys Leu Phe 
1905 

3 Ala Asn Ala He Val Cys 

s His Ala Phe His Thr Pro 
1950 

n Leu Lys Gin Leu Pro Phe 



e Asp Tyr Val Pro Leu Lys Ser Ala Thr Cys He 
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Glu Tyr Arg Gin Tyr Leu Asp 



Val Ala Phe Glu Leu Trp Ala Lys Arg Asn He Lys Pro Val 
2105 2110 2115 

Glu He Lys He Leu Asn Asn Leu Gly Val Asp He Ala Ala 
2120 2125 2130 

Thr Val He Trp Asp Tyr Lys Arg Glu Ala Pro Ala His Val 



c Leu Thr Val Leu Phe Asp Gly Arg Val 
2170 2175 



p Leu Phe Arg Asn Ala Arg Asn Gly Val Leu 
2185 2190 

r Val Lys Gly Leu Thr Pro Ser Lys Gly Pro 



r Arg Asp Leu Glu Asp 



2255 

Met Asp Glu Phe He G 



1 Glu He He Lys 
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Val Glu Thr Phe Tyr Pro Lys Leu Gin Ala Ser Gin Ala Trp Gin 

Pro Gly Val Ala Met Pro Asn Leu Tyr Lys Met Gin Arg Met Leu 
2405 2410 2415 

Leu Glu Lys Cys Asp Leu Gin Asn Tyr Gly Glu Asn Ala Val He 

Pro Lys Gly He Met Met Asn Val Ala Lys Tyr Thr Gin Leu Cys 
2435 2440 2445 

Gin Tyr Leu Asn Thr Leu Thr Leu Ala Val Pro Tyr Asn Met Arg 
2450 2455 2460 

Val He His Phe Gly Ala Gly Ser Asp Lys Gly Val Ala Pro Gly 
2465 

Thr Ala Val Leu A 
2480 

Asp Ser Asp Leu A 
2495 

He Gly Asp Cys Ala Thr Val His Thr Ala Asn Lys Trp Asp Leu 
2510 

He He Ser 
2525 

Glu Asn Asp Ser Lys Glu Gly Phe Phe Thr Tyr Leu Cys Gly P 



o He Gin Leu Ser Ser Tyr Ser Leu Phe Asp 
o Leu Lys Leu Arg Gly Thr Ala Val Met Ser 



2645 

^eu Lys Glu Asn Gin He Asn Asp Met He Tyr 



s Arg Glu Asn Asn Arg Val V 
2680 2685 



<210> SEQ ID NO 76 
<211> LENGTH: 20 
<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION : S/L3/+/4932 primer 
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}N: S/L4/+/6401 primer 
<400> SEQUENCE: 77 
ccgaagttgt aggcaatgtc

<210> SEQ ID NO 78 
<211> LENGTH: 20 
<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION : S/L4/+/6964 primer 
<400> SEQUENCE: 78
tttggtgctc cttcttattg 

<210> SEQ ID NO 79 
<211> LENGTH : 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 



<211> LENGTH: 20 
<212> TYPE : DNA 

<220> FEATURE : 

: S/LS/-/ 



tggtcagtag ggttgattgg 



: S/L5/-/8127 primer 
<400> SEQUENCE: 81 



<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: S/L5/-/8633 primer



<400> SEQUENCE: 82 
gtcacgagtg acaccatcct 
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<211> LENGTH : 20 

<213> ORGANISM : Artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION : S/L5/+/7839 primer 



atgcgacgag tctgcttcta 



<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: S/L5/+/8785 primer 



ttcatagtgc ctggcttacc 



<210> SEQ ID NO 85 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

: S/L5/+/8255 p 



<400> SEQUENCE : 85 
atcttggcgc atgtattgac 



<212> TYPE: DNA 
<213> ORGANISM: A 
<220> FEATURE : 



tgcattagca gca 



<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence
<220> FEATURE : 

: S/L6/-/ 



<400> SEQUENCE: 87 
tctgcagaac agcagaagtg 



<223> OTHER INFORMATION: S/L6/-/10542 primer
<400> SEQUENCE: 88 
cctgtgcagt ttgtctgtca 

<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
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-continued 



<220> FEATURE: 

<223> OTHER INFORMATION: S/L6/+/10677 primer 



<400> SEQUENCE: 89 



ccttgtggca atgaagtaca 



20 



<210> SEQ ID NO 90 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION : S/L6/+/10106 primer 
<400> SEQUENCE: 90

atgtcatttg cacagcagaa 20 



<210> SEQ ID NO 91 

<211> LENGTH : 20 

<213> ORGANISM: Artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION: S/L6/+/9571 primer 

<400> SEQUENCE : 91 

cttcaatggt ttgccatgtt 20 



<210> SEQ ID NO 92 
<211> LENGTH: 20 
<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: S/L7/-/11271 primer 
<400> SEQUENCE: 92 

tgcgagctgt catgagaata 20 
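
The primer labels in this listing follow a regular pattern. The sketch below is one way to read them and to sanity-check an entry; it is illustrative only. The field names (`family`, `fragment`, `sub`, `strand`, `position`) and the helper functions are hypothetical, and the interpretation of the trailing number as a position and of "+"/"-" as strand orientation is an assumption drawn from the label layout, not something the listing states.

```python
import re
from typing import NamedTuple, Optional


class PrimerLabel(NamedTuple):
    family: str          # "S" or "SARS" (hypothetical field name)
    fragment: str        # e.g. "L7"
    sub: Optional[str]   # e.g. "F3" or "R4"; absent in the "S/..." labels
    strand: str          # "+" or "-" (assumed to mark orientation)
    position: int        # trailing number (assumed to be a coordinate)


# Matches labels such as "S/L7/-/11271 primer" or "SARS/L1/F3/+/800 primer".
_LABEL = re.compile(r"^(S|SARS)/(L\d+)(?:/([FR]\d+))?/([+-])/(\d+)\s+primer$")


def parse_label(text: str) -> PrimerLabel:
    """Split a <223> primer label into its slash-separated fields."""
    m = _LABEL.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized primer label: {text!r}")
    family, fragment, sub, strand, position = m.groups()
    return PrimerLabel(family, fragment, sub, strand, int(position))


_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a lowercase DNA string, ignoring spaces."""
    bases = "".join(seq.split()).lower()
    return bases.translate(_COMPLEMENT)[::-1]


def check_length(seq: str, declared: int) -> bool:
    """Compare the number of bases with the declared <211> LENGTH."""
    return len("".join(seq.split())) == declared


label = parse_label("S/L7/-/11271 primer")
primer = "tgcgagctgt catgagaata"
print(label)
print(check_length(primer, 20))
print(reverse_complement(primer))
```

A check of this kind only confirms internal consistency of an entry (label shape, declared length); it says nothing about whether the primer anneals where the label suggests.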




<220> FEATURE: 

<223> OTHER INFORMATION: S/L7/-/11801 primer 
<400> SEQUENCE : 93 

aaccgagagc agtaccacag 20 



<210> SEQ ID NO 94 

<211> LENGTH: 20 

<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: S/L7/-/12383 primer 

<400> SEQUENCE: 94 

tttggctgct gtagtcaatg 20 



<211> LENGTH : 20 
<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: S/L7/+/12640 primer 



<400> SEQUENCE : 95 
ctacgacaga tgtcctgtgc



<223> OTHER INFORMATION: S/L7/+/12088 primer 
<400> SEQUENCE : 96 



<213> ORGANISM: Artificial sequence 

<223> OTHER INFORMATION : S/L7/+/11551 primer 

<400> SEQUENCE: 97

ttaggctatt gttgctgctg 

<210> SEQ ID NO 98 



<223> OTHER INFORMATION: S/L8/-/13160 primer 
<400> SEQUENCE: 98 
cagacaacat gaagcaccac 

<211> LENGTH : 20 

<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: S/L8/-/13704 primer

<400> SEQUENCE: 99 
cgctgacgtg atatatgtgg 

<210> SEQ ID NO 100 

<211> LENGTH: 20 

<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION: S/L8/-/14284 primer 



<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: S/L8/+/1445 



acatagctcg cgtctcagtt 



20 
<210> SEQ ID NO 102

<211> LENGTH : 20 

<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION: S/L8/+/13968 primer 

<400> SEQUENCE: 102 



<211> LENGTH : 19 

<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: S/L8/+/13401 primer 

<400> SEQUENCE: 103 




<213> ORGANISM: Artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION : S/L9/-/15098 primer 

<400> SEQUENCE : 104 



<211> LENGTH : 20 

<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: S/L9/-/15677 primer 

<400> SEQUENCE : 105 



<210> SEQ ID NO 106 

<211> LENGTH : 20 

<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: S/L9/-/16247 primer 

<400> SEQUENCE: 106 

catggtcata gcagcacttg 20 



<211> LENGTH: 21 

<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: S/L9/+/16323 primer 

<400> SEQUENCE: 107 



<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 



<220> FEATURE: 

<223> OTHER INFORMATION: 




<400> SEQUENCE : 108 



ccttacccag atccatcaag 



20 



<210> SEQ ID NO 109 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: S/L9/+/15285 primer
<400> SEQUENCE: 109 

cgcaaacata acacttgctg 20 



<210> SEQ ID NO 110 
<211> LENGTH : 20 
<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: S/L10/-/16914 primer 
<400> SEQUENCE: 110 

agtgttgggt acaagccagt 20 



<210> SEQ ID NO 111 
<211> LENGTH: 20 
<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: S/L10/-/17466 primer 
<400> SEQUENCE : 111 

gttccaagga acatgtctgg 20 




<220> FEATURE : 

<223> OTHER INFORMATION : S/L10/-/18022 primer 
<400> SEQUENCE: 112

aggtgcctgt gtaggatgaa 20 



<210> SEQ ID NO 113 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: S/L10/+/18245 primer
<400> SEQUENCE: 113 

gggctgtcat gcaactagag 20 



<210> SEQ ID NO 114 

<211> LENGTH: 20 

<212> TYPE: DNA 

<220> FEATURE:

<223> OTHER INFORMATION: S/L10/+/17663 primer 

<400> SEQUENCE: 114 
tcttacacgc aatcctgctt

<210> SEQ ID NO 115 
<211> LENGTH : 20 
<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION : S/L10/+/170 
<400> SEQUENCE: 115 
tacccatctg ctcgcatagt 



<223> OTHER INFORMATION: S/L11/-/18877 primer



caagcagaa ttaaccctce 
<210> SEQ ID NO 117



<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: S/L11/-/1 



S/Lll/-/ 



<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: S/L11/+/20245 primer



<400> SEQUENCE: 119
acacat cgtttatgga 



<213> ORGANISM: Artificial sequence
<220> FEATURE:

<223> OTHER INFORMATION: S/L11/+/19611 primer



<400> SEQUENCE : 
gaagcacctg tttcc 
<210> SEQ ID NO 121
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE : 

<223> OTHER INFORMATION: S/L11/+/19021 primer 
<400> SEQUENCE: 121 
acgatgctca gccatgtagt 



<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: SARS/L1/F3/+/800 primer 



<400> SEQUENCE: 122 
gaggtgcagt cactcgctat 



<211> LENGTH: 



<213> ORGANISM: Artificial sequence 
<220> FEATURE:

<223> OTHER INFORMATION: SARS/L1/F4/+/1391 primer 
<400> SEQUENCE: 123 
cagagattgg acctgagcat 



<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: SARS/L1/F5/+/1925 primer



<213> ORGANISM: Artificial sequence
<223> OTHER INFORMATION: SARS/L1/R3/ -/ 



<211> LENGTH : 20 

<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION: SARS/L1/R4/-/1107 primer 



cacgtggttg aatgactttg 
<400> SEQUENCE: 127
atttctgcaa ccagctcaac 



<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: SARS/L2/F3/+/2664 primer



cgcattgtct cctggtttac 



<213> ORGANISM: Artificial sequence

<223> OTHER INFORMATION: SARS/L2/F4/+/3232 primer



gagattgagc cagaaccaga 



<213> ORGANISM: Artificial sequence
<220> FEATURE : 

<223> OTHER INFORMATION: SARS/L2/F5/+/ 



<210> SEQ ID NO 131 
<211> LENGTH: 20 
<212> TYPE: DNA

<213> ORGANISM: Artificial sequence 



<223> OTHER INFORMATION: SARS/L2/R3/-/3



ctgccttaag aagctggat 



<223> OTHER INFORMATION: SARS/L2/R4/-/ 



tttcttcacc agcatcatca 



<210> SEQ ID NO 133 
<211> LENGTH : 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: SARS/L2/R5/-/2529 primer 



<400> SEQUENCE: 133 
caccgttctt gagaacaacc

<210> SEQ ID NO 134 
<211> LENGTH: 20 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<223> OTHER INFORMATION: SARS/L3/F3/+/4708 primer
<400> SEQUENCE: 134 
tctttggctg gctcttacag 

<213> ORGANISM: Artificial sequence 
<223> OTHER INFORMATION: SRAS/L3/F4/+/5305 primer
<400> SEQUENCE : 135 
gctggtgatg ctgctaactt 

<210> SEQ ID NO 136 

<213> ORGANISM: Artificial sequence

<223> OTHER INFORMATION: SARS/L3/F5/+/5822 primer 
<400> SEQUENCE: 136 
ccatcaagcc tgtgtcgtat 

<210> SEQ ID NO 137 

<211> LENGTH: 20 

<213> ORGANISM: Artificial sequence 

<223> OTHER INFORMATION: SARS/L3/R3/-/5610 primer 

<400> SEQUENCE: 137 



<211> LENGTH: 20 

<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION: SARS/L3/R4/-/4988 primer

<400> SEQUENCE : 138 



<223> OTHER INFORMATION: SARS/L3/R5/-/4
<400> SEQUENCE: 139 
atcggacacc atagtcaacg 
<210> SEQ ID NO 140
<211> LENGTH: 7788 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: synthetic S gene 
<400> SEQUENCE: 140 

aatatgaccg ccatgttggc attgattatt gactagttat taatagtaat caattacggg 180 

gtcattagtt catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc 240 

gcctggctga ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat 300 

agtaacgcca atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc 360 

ccacttggca gtacatcaag tgtatcatat gccaagtccg ccccctattg acgtcaatga 420 

cggtaaatgg cccgcctggc attatgccca gtacatgacc ttacgggact ttcctacttg 480 

gcagtacatc tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacac 540

caatgggcgt ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt 600 

caatgggagt ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaataaccc 660 

cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata taagcagagc 720 

tcgtttagtg aaccgtcaga tcactagaag ctttattgcg gtagtttatc acagttaaat 780 

tgctaacgca gtcagtgctt ctgacacaac agtctcgaac ttaagctgca gaagttggtc 840 

gtgaggcact gggcaggtaa gtatcaaggt tacaagacag gtttaaggag accaatagaa 900 

actgggcttg tcgagacaga gaagactctt gcgtttctga taggcaccta ttggtcttac 960 

tgacatccac tttgcctttc tctccacagg tgtccactcc cagttcaatt acagctctta 1020 

aggctagagt acttaatacg actcactata ggctagcgga tccaccatgt tcatcttcct 1080 

gctgttcctg accctgacca gcggcagcga cctggaccgg tgcaccacct tcgacgacgt 1140 

gcaggccccc aactacaccc agcacaccag cagcatgcgg ggcgtgtact accccgacga 1200 

gatctttcgg agcgacaccc tgtacctgac ccaggacctg ttcctgccct tctacagcaa 1260 

cgtgaccggc ttccacacca tcaaccacac cttcggcaac cccgtgatcc ccttcaagga 1320 

cggcatctac ttcgccgcca ccgagaagag caacgtggtg cggggctggg tgttcggcag 1380 

caccatgaac aacaagagcc agagcgtgat catcatcaac aacagcacca acgtggtgat 1440 

ccgggcctgc aacttcgagc tgtgcgacaa ccccttcttc gccgtgtcca aacccatggg 1500

cacccagacc cacaccatga tcttcgacaa cgccttcaac tgcaccttcg agtacatcag 1560 

cgacgccttc agcctggacg tgagcgagaa gagcggcaac ttcaagcacc tgcgggagtt 1620 

cgtgttcaag aacaaggacg gcttcctgta cgtgtacaag ggctaccagc ccatcgacgt 1680 

ggtgagagac ctgcccagcg gcttcaacac cctgaagccc atcttcaagc tgcccctggg 1740 

catcaacatc accaacttcc gggccatcct gaccgccttt agccctgccc aggacatctg 1800 

gggcaccagc gccgccgcct acttcgtggg ctacctgaag cctaccacct tcatgctgaa 1860 

gtacgacgag aacggcacca tcaccgacgc cgtggactgc agccagaacc ccctggccga 1920 

gctgaagtgc agcgtgaaga gcttcgagat cgacaagggc atctaccaga ccagcaactt 1980 

cagagtggtg cctagcggcg atgtggtgcg gttccccaat atcaccaacc tgtgcccctt 2040 
cggcgaagtg ttcaacgcca ccaagttccc cagcgtgtac gcctgggagc ggaagaagat 2100

cagcaactgc gtggccgact acagcgtgct gtacaactcc accttcttca gcaccttcaa 2160

gtgctacggc gtgagcgcca ccaagctgaa cgacctgtgc ttcagcaacg tgtacgccga 2220

cagcttcgtg gtgaagggcg acgacgtgag acagatcgcc cctggccaga ccggcgtgat 2280

cgccgactac aactacaagc tgcccgacga cttcatgggc tgcgtgctgg cctggaacac 2340

cggcaagctg cggcccttcg agcgggacat ctccaacgtg cccttcagcc ccgacggcaa 2460

gccctgcacc ccccctgccc tgaactgcta ctggcccctg aacgactacg gcttctacac 2520

caccaccggc atcggctatc agccctacag agtggtggtg ctgagcttcg agctgctgaa 2580

cgcccctgcc accgtgtgcg gccccaagct gagcaccgac ctgatcaaga accagtgcgt 2640

gaacttcaac ttcaacggcc tgaccggcac cggcgtgctg acccccagca gcaagcgctt 2700

ccagcccttc cagcagttcg gccgggatgt gagcgacttc accgacagcg tgcgggaccc 2760

caagaccagc gagatcctgg acatcagccc ctgcagcttc ggcggcgtgt ccgtgatcac 2820

ccccggcacc aacgccagca gcgaagtggc cgtgctgtac caggacgtga actgcaccga 2880

cgtgagcacc gccatccacg ccgaccagct gacccccgcc tggcggatct acagcaccgg 2940

gaacaacgtg ttccagaccc aggccggctg cctgatcggc gccgagcacg tggacaccag 3000

ctacgagtgc gacatcccca ttggcgccgg aatctgcgcc agctaccaca ccgtgagcct 3060

gctgcggagc accagccaga agtccatcgt ggcctacacc atgagcctgg gcgccgacag 3120

cagcatcgcc tacagcaaca acaccatcgc catccccacc aacttcagca tctccatcac 3180

caccgaagtg atgcccgtga gcatggccaa gacaagcgtg gattgcaaca tgtacatctg 3240

cggcgacagc accgagtgcg ccaacctgct gctgcagtac ggcagcttct gcacccagct 3300

gaaccgggcc ctgagcggca tcgccgccga gcaggaccgg aacaccagag aagtgttcgc 3360

ccaagtgaag cagatgtata agacccccac cctgaagtac ttcgggggct tcaacttctc 3420

tcagatcctg cccgaccctc tgaagcccac caagcgctcc ttcatcgagg acctgctgtt 3480

caacaaagtg accctggccg acgccggctt tatgaagcag tacggcgagt gcctgggcga 3540

catcaacgcc cgggacctga tctgcgccca gaagtttaac gggctgaccg tgctgccccc 3600

cctgctgacc gacgacatga tcgccgccta tacagccgcc ctggtgagcg gcaccgccac 3660

cgccggctgg accttcggag ccggagccgc cctgcagatc cccttcgcca tgcagatggc 3720

ctaccggttc aacggcatcg gcgtgaccca gaacgtgctg tacgagaacc agaagcagat 3780

cgccaaccag ttcaacaagg ccatcagcca gatccaggag agcctgacca caaccagcac 3840

cgccctgggc aagctgcagg acgtggtgaa ccagaacgcc caggccctga acaccctggt 3900

gaagcagctg agcagcaact tcggcgccat cagctctgtg ctgaacgaca tcctgagcag 3960

gctggacaaa gtggaggccg aagtgcagat cgaccggctg atcaccggac gcctgcagtc 4020

cctgcagacc tacgtgaccc agcagctgat cagagccgcc gagatccggg ccagcgccaa 4080

tctggccgcc accaagatga gcgagtgcgt gctgggccag agcaagagag tggacttctg 4140

cggcaagggc tatcacctga tgagcttccc ccaggccgcc ccccacggcg tggtgttcct 4200

gcacgtgacc tacgtgccta gccaggagcg gaacttcacc accgccccag ccatctgcca 4260

cgagggcaag gcctacttcc cccgggaggg cgtgttcgtg tttaacggca ccagctggtt 4320
cggcaactgt gatgtggtga tcggcatcat caataacacc gtgtacgacc ccctgcagcc 4440

cgagctggac agcttcaagg aggagctgga caaatacttc aagaaccaca cctcccccga 4500

cgtggacctg ggcgatatca gcggcatcaa cgcctccgtg gtgaacatcc agaaggagat 4560

cgacagactg aacgaagtgg ccaagaacct gaacgagagc ctgatcgacc tgcaggagct 4620

gggcaagtac gagcagtaca tcaagtggcc ctggtacgtg tggctgggct tcatcgccgg 4680

cctgatcgcc atcgtgatgg tgaccatcct gctgtgctgc atgaccagct gctgtagctg 4740

cctgaaaggc gcctgcagct gtggcagctg ctgcaagttc gacgaggacg acagcgagcc 4800

cgtgctgaag ggcgtgaagc tgcactacac ctgataactc gagaattcac gcgtggtacc 4860

tctagagtcg acccgggcgg ccgcttcgag cagacatgat aagatacatt gatgagtttg 4920

gacaaaccac aactagaatg cagtgaaaaa aatgctttat ttgtgaaatt tgtgatgcta 4980

ttgctttatt tgtaaccatt ataagctgca ataaacaagt taacaacaac aattgcattc 5040

attttatgtt tcaggttcag ggggagatgt gggaggtttt ttaaagcaag taaaacctct 5100

acaaatgtgg taaaatcgat aaggatccgg gctggcgtaa tagcgaagag gcccgcaccg 5160

atcgcccttc ccaacagttg cgcagcctga atggcgaatg gacgcgccct gtagcggcgc 5220

attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg ccagcgccct 5280

agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg gctttccccg 5340

tcaagctcta aatcgggggc tccctttagg gttccgattt agagctttac ggcacctcga 5400

ccgcaaaaaa cttgatttgg gtgatggttc acgtagtggg ccatcgccct gatagacggt 5460

ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt tccaaactgg 5520

aacaacactc aaccctatct cggtctattc ttttgattta taagggattt tgccgatttc 5580

ggcctattgg ttaaaaaatg agctgattta acaaatattt aacgcgaatt ttaacaaaat 5640

attaacgttt acaatttcgc ctgatgcggt attttctcct tacgcatctg tgcggtattt 5700

cacaccgcat atggtgcact ctcagtacaa tctgctctga tgccgcatag ttaagccagc 5760

cccgacaccc gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg 5820

cttacagaca agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat 5880

caccgaaacg cgcgagacga aagggcctcg tgatacgcct atttttatag gttaatgtca 5940

tgataataat ggtttcttag acgtcaggtg gcacttttcg gggaaatgtg cgcggaaccc 6000

ctatttgttt atttttctaa atacattcaa atatgtatcc gctcatgaga caataaccct 6060

gataaatgct tcaataatat tgaaaaagga agagtatgag tattcaacat ttccgtgtcg 6120

cccttattcc cttttttgcg gcattttgcc ttcctgtttt tgctcaccca gaaacgctgg 6180

tgaaagtaaa agatgctgaa gatcagttgg gtgcacgagt gggttacatc gaactggatc 6240

tcaacagcgg taagatcctt gagagttttc gccccgaaga acgttttcca atgatgagca 6300

cttttaaagt tctgctatgt ggcgcggtat tatcccgtat tgacgccggg caagagcaac 6360

tcggtcgccg catacactat tctcagaatg acttggttga gtactcacca gtcacagaaa 6420

agcatcttac ggatggcatg acagtaagag aattatgcag tgctgccata accatgagtg 6480

ataacactgc ggccaactta cttctgacaa cgatcggagg accgaaggag ctaaccgctt 6540

ttttgcacaa catgggggat catgtaactc gccttgatcg ttgggaaccg gagctgaatg 6600
aagccatacc aaacgacgag cgtgacacca cgatgcctgt agcaatggca acaacgttgc 6660

gcaaactatt aactggcgaa ctacttactc tagcttcccg gcaacaatta atagactgga 6720

tggaggcgga taaagttgca ggaccacttc tgcgctcggc ccttccggct ggctggttta 6780

ttgctgataa atctggagcc ggtgagcgtg ggtctcgcgg tatcattgca gcactggggc 6840

cagatggtaa gccctcccgt atcgtagtta tctacacgac ggggagtcag gcaactatgg 6900

atgaacgaaa tagacagatc gctgagatag gtgcctcact gattaagcat tggtaactgt 6960

cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa

ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt

cgttccactg agcgtcagac cccgtagaaa agatcaaagg atcttcttga gatccttttt

ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc gctaccagcg gtggtttgtt

tgccggatca agagctacca actctttttc cgaaggtaac tggcttcagc agagcgcaga

taccaaatac tgtccttcta gtgtagccgt agttaggcca ccacttcaag aactctgtag

caccgcctac atacctcgct ctgctaatcc tgttaccagt ggctgctgcc agtggcgata

agtcgtgtct taccgggttg gactcaagac gatagttacc ggataaggcg cagcggtcgg

gctgaacggg gggttcgtgc acacagccca gcttggagcg aacgacctac accgaactga

gatacctaca gcgtgagcta tgagaaagcg ccacgcttcc cgaagggaga aaggcggaca

ggtatccggt aagcggcagg gtcggaacag gagagcgcac gagggagctt ccagggggaa

tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc cagcaacgcg gcctttttac

ggttcctggc cttttgctgg ccttttgctc acatggctcg acagatct
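
A nucleotide row in a listing of this kind consists of groups of ten bases followed, in most rows, by a running position count. A minimal sketch of a mechanical check on a transcribed row is given below. The function names are hypothetical, the restriction to the four letters a, c, g and t is an assumption (ambiguity codes are not handled), and the example row is an invented mis-scanned row, not a quotation.

```python
VALID_BASES = set("acgt")  # assumption: no IUPAC ambiguity codes expected


def parse_row(row: str):
    """Split a listing row into (bases, running position or None)."""
    tokens = row.split()
    end = None
    if tokens and tokens[-1].isdigit():
        end = int(tokens.pop())  # trailing number is the running count
    return "".join(tokens), end


def suspicious_positions(bases: str):
    """Return (1-based index, character) for every non-ACGT character."""
    return [(i + 1, ch) for i, ch in enumerate(bases) if ch not in VALID_BASES]


# Hypothetical row in which a scanner has read one "c" as "o".
row = "cggcaagotg cggcccttcg agcgggacat ctccaacgtg cccttcagcc ccgacggcaa 2460"
bases, end = parse_row(row)
print(len(bases), end, suspicious_positions(bases))
```

A full row should yield 60 bases, and consecutive running counts should differ by 60; a larger jump points to a missing row rather than a misread character.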

<210> SEQ ID NO 141 
<211> LENGTH: 23 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION : SNE-S1 primer 



ggttgggatt atccaaaatg tga 



<210> SEQ ID NO 142 

<211> LENGTH: 24 

<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 



<400> SEQUENCE: 142



<223> OTHER INFORMATION: 
<400> SEQUENCE: 143 
cctctcttgt tcttgctcgc a 
<211> LENGTH: 21
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION : SAR1-AS primer 
<400> SEQUENCE: 144 



<211> LENGTH : 45 

<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE : 

<223> OTHER INFORMATION: PCR primer 

<400> SEQUENCE: 145 



<211> LENGTH: 37 

<212> TYPE : DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION : PCR primer 

<400> SEQUENCE: 146 



<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer 



ataggatcca ccatgtttat tttcttatta tttcttactc t 



<211> LENGTH: 36 

<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 

<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer 

<400> SEQUENCE: 148 



<210> SEQ ID NO 149 
<211> LENGTH: 13 
<212> TYPE: PRT 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: N-terminal end of SRAS-CoV S protein
(amino acids 1 to 13) 

<400> SEQUENCE: 149 

Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly
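
The 13-residue N-terminal peptide above can be cross-checked against the nucleotide entries by translating the start of the coding sequence. The sketch below is illustrative only: it assumes the standard genetic code, the function and variable names are hypothetical, and the two input strings are taken as the 39 bases following "ccacc" in the synthetic S gene (SEQ ID NO 140) and the 33 bases following "cc" in the PCR primer of SEQ ID NO 155.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a lowercase DNA string codon by codon (one-letter code)."""
    dna = "".join(dna.split()).lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Start of the open reading frame in the synthetic S gene.
synthetic_start = "atgttcatcttcctgctgttcctgaccctgaccagcggc"
# Start of the open reading frame carried by the SEQ ID NO 155 primer.
native_start = "atgtttattttcttattatttcttactctcact"

print(translate(synthetic_start))
print(translate(native_start))
```

Both inputs use different codons but encode the same leading residues, which is what one would expect if the synthetic gene differs from the native sequence only in codon usage.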
<210> SEQ ID NO 150
<211> LENGTH : 10 
<212> TYPE: PRT 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: oligopeptide 
<400> SEQUENCE: 150 

Ser Gly Asp Tyr Lys Asp Asp Asp Asp Lys 
1 5 10 




<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: PCR primer 
<400> SEQUENCE : 151 

actagctagc ggatccacca tgttcatctt cctg 34 



<210> SEQ ID NO 152 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<223> OTHER INFORMATION: PCR primer 
<400> SEQUENCE: 152 

agtatccgga cttgatgtac tgctcgtact tgc 33 



<210> SEQ ID NO 153 
<211> LENGTH: 59 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE: 

<223> OTHER INFORMATION: oligonucleotide
<400> SEQUENCE: 153 

tatgagcttt tttttttttt tttttttggc atataaatag actcggcgcg ccatctgca 59 



<210> SEQ ID NO 154 
<211> LENGTH: 53 
<212> TYPE: DNA 

<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION: oligonucleotide
<400> SEQUENCE : 154 

gatggcgcgc cgagtctatt tatatgccaa aaaaaaaaaa aaaaaaaagc tca 53




<400> SEQUENCE: 155 



atacgtacga ccatgtttat tttcttatta tttcttactc tcact 45



<210> SEQ ID NO 156 
<211> LENGTH: 40 
<212> TYPE: DNA 



<213> ORGANISM: Artificial sequence 
<220> FEATURE : 

<223> OTHER INFORMATION : PCR primer 
<400> SEQUENCE: 156 

atagcgcgct cattatgtgt aatgtaattt gac 




<210> SEQ ID NO 158



<400> SEQUENCE: 158 

ataggatccg cgcgctcatt atttatcgtc 



1. An isolated and purified protein or polypeptide, char- 
acterized in that it is the S protein having the sequence SEQ 
ID No: 3, its ectodomain or a fragment of its ectodomain. 

2. The protein or polypeptide as claimed in claim 1, 
characterized in that it consists of the amino acids corre- 
sponding to positions 1 to 1193 of the amino acid sequence 
of the S protein. 

3. The protein or polypeptide as claimed in claim 1, 
characterized in that it consists of the amino acids corre- 
sponding to positions 14 to 1193 of the amino acid sequence 
of the S protein. 

4. The isolated protein or polypeptide as claimed in claim 
1, characterized in that it consists of the amino acids 
corresponding to positions 475 to 1193 of the amino acid 
sequence of the S protein. 

5. A nucleic acid encoding a protein or a polypeptide as 
claimed in any one of claims 1 to 4. 

6. The nucleic acid as claimed in claim 5, characterized in 
that it comprises the sequence encoding SEQ ID No: 5 or the 
sequence encoding SEQ ID No: 6. 

7. A recombinant expression vector, characterized in that 
it encodes a protein or a polypeptide as claimed in any one 
of claims 1 to 4. 

8. The recombinant expression vector as claimed in claim 
7, characterized in that it is chosen from the vectors con- 
tained in the following bacterial strains, deposited at the 
Collection Nationale de Cultures de Microorganismes 
(CNCM), 25 rue du Docteur Roux, 75724 Paris Cedex 15: 

a) strain No. I-3118, deposited on Oct. 23, 2003, 

b) strain No. I-3019, deposited on May 12, 2003, 

c) strain No. I-3020, deposited on May 12, 2003, 

d) strain No. I-3059, deposited on Jun. 20, 2003, 

e) strain No. I-3323, deposited on Nov. 22, 2004, 

f) strain No. I-3324, deposited on Nov. 22, 2004, 

g) strain No. I-3326, deposited on Dec. 1, 2004, 

h) strain No. I-3327, deposited on [...] 2004, 

i) strain No. I-3332, deposited on Dec. 1, 2004, 

j) strain No. I-3333, deposited on Dec. 1, 2004, 

k) strain No. I-3334, deposited on Dec. 1, 2004, 

l) strain No. I-3335, deposited on Dec. 1, 2004, 

m) strain No. I-3336, deposited on Dec. 1, 2004, 

n) strain No. I-3337, deposited on Dec. 1, 2004, 

o) strain No. I-3338, deposited on Dec. 2, 2004, 

p) strain No. I-3339, deposited on Dec. 2, 2004, 

q) strain No. I-3340, deposited on Dec. 2, 2004, 

r) strain No. I-3341, deposited on Dec. 2, 2004. 



9. A nucleic acid containing a synthetic gene allowing 
optimized expression of the S protein in eukaryotic cells, 
characterized in that it possesses the sequence SEQ ID No: 
140. 

10. An expression vector containing a nucleic acid as 
claimed in claim 9, characterized in that it is contained in the 
bacterial strain deposited at the CNCM, on Dec. 1, 2004, 
under the No. I-3333. 

11. The expression vector as claimed in claim 7 or claim 
9, characterized in that it is a viral vector, in the form of a 
viral particle or in the form of a recombinant genome. 

12. The vector as claimed in claim 11, characterized in 
that it is a recombinant viral particle or a recombinant viral 
genome capable of being obtained by transfecting a plasmid 
according to paragraphs g), h) or k) to r) of claim 8, into an 
appropriate cellular system. 

13. A lentiviral vector encoding a polypeptide as claimed 
in any one of claims 1 to 4. 

14. A recombinant measles virus encoding a polypeptide 
as claimed in any one of claims 1 to 4. 

15. A recombinant vaccinia virus encoding a polypeptide 
as claimed in any one of claims 1 to 4. 

16. The use of a vector according to paragraphs d) to p) 
of claim 8, or of a vector as claimed in claim 10, for the 
production, in a eukaryotic system, of the SARS-associated 
coronavirus S protein or of a fragment of this protein. 

17. A method for producing the S protein in a eukaryotic 
system, comprising a step of transfecting eukaryotic cells in 
culture with a vector chosen from the vectors contained in 
the bacterial strains mentioned in paragraphs d) to p) of 
claim 8, or in claim 10. 

18. A genetically modified eukaryotic cell expressing a 
protein or a polypeptide as claimed in any one of claims 1 
to 4. 

19. The cell as claimed in claim 18, capable of being 
obtained by transfection with any one of the vectors men- 
tioned in paragraphs k) to n) of claim 8. 

20. The cell as claimed in claim 19, characterized in that 
it is the cell FRhK4-Ssol-30, deposited at the CNCM on 
Nov. 22, 2004, under the No. I-3325. 

21. A monoclonal antibody recognizing the native S 
protein of a SARS-associated coronavirus. 

22. The use of a protein or a polypeptide as claimed in any 
one of claims 1 to 4, or of an antibody as claimed in claim 
21, for detecting a SARS-associated coronavirus infection, 
from a biological sample. 

23. A method for detecting a SARS-associated coronavi- 
rus, from a biological sample, characterized in that the 
detection is carried out by ELISA using the recombinant S 
protein or its ectodomain, or a fragment of its ectodomain, 
expressed in a eukaryotic system. 

24. The method of detection as claimed in claim 23, 
additionally comprising a step of detection by ELISA using 
the recombinant N protein. 

25. The method as claimed in claim 23 or 24, character- 
ized in that it is a double epitope ELISA method, and in that 
the serum to be tested is mixed with the visualizing antigen, 
said mixture then being brought into contact with the antigen 
attached to a solid support. 

26. An immune complex formed of a monoclonal anti- 
body or antibody fragment as claimed in claim 21, and of a 
SARS-associated coronavirus protein or peptide. 

27. An immune complex formed of a protein or a polypep- 
tide as claimed in any one of claims 1 to 4, and of an 
antibody directed specifically against an epitope of the 
SARS-associated coronavirus. 

28. A SARS-associated coronavirus detection kit or box, 
characterized in that it comprises at least one reagent 
selected from the group consisting of: a protein or polypep- 
tide as claimed in any one of claims 1 to 4, a nucleic acid as 
claimed in either of claims 5 and 6, a cell as claimed in any 
one of claims 18 to 20, or an antibody as claimed in claim 
21. 

29. An immunogenic and/or vaccine composition, char- 
acterized in that it comprises a recombinant protein or 
polypeptide as claimed in any one of claims 1 to 4, obtained 
in a eukaryotic expression system. 

30. An immunogenic and/or vaccine composition, char- 
acterized in that it comprises a recombinant vector or virus 
as claimed in any one of claims 7, 8, and 10 to 15. 

31. A nucleic acid insert of viral origin, characterized in 
that it is contained in any one of the strains mentioned in 
paragraphs a) to h) and k) to r) of claim 8. 



